Microorganisms and assays for the identification of antibiotics

ABSTRACT

The present invention features methods for the identification of compounds and compositions useful as antibiotics and antibacterial agents. In particular, the invention features methods for the identification of modulators of a previously unidentified target protein, termed CoaX. High-throughput assay systems are featured as well as assay kits for the identification of CoaX modulators. Also featured are coaX nucleic acid molecules and purified CoaX proteins, as well as recombinant vectors and microorganisms including the gene, coaX.

RELATED APPLICATIONS

[0001] The instant application claims the benefit of prior filed provisional U.S. patent application Ser. No. 60/227,860, entitled “Novel Microbial Pantothenate Kinase Gene and Methods of Use”, filed Aug. 24, 2000. The instant application is also related to U.S. patent application Ser. No. 09/667,569, entitled “Methods and Microorganisms for Production of Panto-Compounds”, filed Sep. 21, 2000 (pending). The entire content of the above-referenced patent applications is incorporated herein by this reference.

BACKGROUND OF THE INVENTION

[0002] Antimicrobial or antibiotic treatment is a well-accepted therapy for fighting microbial infections. It takes advantage of biological processes that are unique to bacteria or fungi and that can be safely inhibited to the detriment of the microbe without producing undesired or harmful side effects in the individual receiving such therapy. However, due at least in part to the continual evolution of microbial resistance to the available classes of antibiotics, and in part to the recent slowdown in the introduction of novel antimicrobials to market, there exists a need for the development of screening assays that target previously unexploited biochemical systems in microbes. In particular, there exists a need for the identification of new bacterial targets for use in drug discovery programs designed to identify agents having potential use as anti-infective agents with novel modes of action.

SUMMARY OF THE INVENTION

[0003] The present invention is based, at least in part, on the identification of a novel target for use in screening assays designed to identify antimicrobial agents. In particular, the present invention is based on the identification and characterization of a previously unidentified microbial pantothenate kinase gene, coaX. The coaX gene was first identified in B. subtilis, where it is one of two genes encoding functional pantothenate kinase. Initially the present inventors identified and cloned the B. subtilis coaA gene (previously termed yqjS) that encodes a pantothenate kinase homologous to the CoaA enzyme previously characterized in E. coli. A second gene (previously termed yacB) has also been identified and cloned by the present inventors that is not homologous to any previously described pantothenate kinase. This latter pantothenate kinase-encoding gene has been renamed coaX. The coaX gene could be deleted from B. subtilis strains with an intact coaA gene, but it could not be deleted from a strain containing a deletion in the coaA gene, indicating that the coaX gene is not essential in B. subtilis strains with a wild-type coaA gene. Homologs of the coaX gene can be found in a number of bacterial species, including but not limited to Aquifex aeolicus, Bacillus anthracis, Bacillus halodurans, Bacillus stearothermophilus, Caulobacter crescentus, Chlorobium tepidum, Clostridium acetobutylicum, Dehalococcoides ethenogenes, Deinococcus radiodurans, Desulfovibrio vulgaris, Geobacter sulfurreducens, Pseudomonas putida, Rhodobacter capsulatus, Thiobacillus ferrooxidans, Streptomyces coelicolor, Synechocystis sp., Thermotoga maritima, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Porphyromonas gingivalis, Pseudomonas aeruginosa, Pseudomonas syringae pv tomato, Treponema pallidum, Xylella fastidiosa and Mycobacterium tuberculosis. More importantly, however, this novel pantothenate kinase gene has been found to be the sole essential pantothenate kinase in troublesome pathogens including, but not limited to, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Helicobacter pylori, Neisseria meningitidis, Pseudomonas aeruginosa, Treponema pallidum and Xylella fastidiosa. Accordingly, the coaX gene represents an attractive target for screening for new antibacterial compounds to combat these pathogenic microorganisms, particularly microorganisms in which coaX is the sole pantothenate kinase-encoding gene.

[0004] Accordingly, the present invention features isolated CoaX proteins, in particular, proteins encoded by the coaX gene in bacteria. The invention also features isolated nucleic acid molecules and/or genes, e.g., bacterial nucleic acid molecules and/or genes, in particular, isolated bacterial coaX nucleic acid molecules and/or genes. Also featured are vectors that contain isolated coaX nucleic acid molecules and/or genes as well as mutant coaX nucleic acid molecules and/or genes. Also featured are recombinant microorganisms (e.g., microorganisms belonging to the genus Escherichia or Bacillus, for example, E. coli or B. subtilis) containing isolated coaX nucleic acid molecules and/or genes or mutant coaX nucleic acid molecules and/or genes of the present invention. In particular, the invention features recombinant microorganisms that produce the CoaX proteins of the present invention, e.g., pantothenate kinase proteins encoded by the coaX nucleic acid molecules and/or genes of the present invention.

[0005] Also featured are methods for identifying CoaX modulators utilizing, for example, isolated CoaX proteins of the present invention or recombinant microorganisms expressing the CoaX proteins of the present invention.

[0006] Other features and advantages of the invention will be apparent from the following detailed description and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0007] FIG. 1 is a schematic representation of the Coenzyme A biosynthetic pathway in E. coli.

[0008] FIG. 2 is a schematic representation of the structure of the Bacillus subtilis genome in the region of the coaA gene. The scale is in base pairs and the significant open reading frames are shown by open arrows.

[0009] FIG. 3 is a schematic representation of the structure of pAN296, a plasmid designed to delete most of the B. subtilis coaA gene and substitute a chloramphenicol resistance gene.

[0010] FIG. 4 is a schematic representation of the structure of the Bacillus subtilis genome in the region of the coaX (yacB) gene. The scale is in base pairs, the significant open reading frames are shown by open arrows and certain predicted restriction fragments are indicated by thick bars.

[0011] FIG. 5 is a schematic representation of the structure of pAN341 and pAN342, two independent PCR-derived clones of B. subtilis yacB (renamed herein as coaX).

[0012] FIG. 6A-D depicts a multiple sequence alignment (MSA) of the amino acid sequences encoded by fourteen known or predicted microbial coaX genes. SEQ ID NOs:2-15 correspond to the amino acid sequences of Bacillus subtilis (SwissProt™ Accession No. P37564), Clostridium acetobutylicum (WIT™ Accession No. RCA03301, Argonne National Laboratories), Streptomyces coelicolor (PIR™ Accession No. T36391), Mycobacterium tuberculosis (SwissProt™ Accession No. O06282), Rhodobacter capsulatus (WIT™ Accession No. RRC02473), Desulfovibrio vulgaris (DBJ™ Accession No. BAA21476.1), Deinococcus radiodurans (SwissProt™ Accession No. Q9RX54), Thermotoga maritima (GenBank™ Accession No. AAD35964.1), Treponema pallidum (SwissProt™ Accession No. O83446), Borrelia burgdorferi (SwissProt™ Accession No. O51477), Aquifex aeolicus (SwissProt™ Accession No. O67753), Synechocystis sp. (SwissProt™ Accession No. P74045), Helicobacter pylori (SwissProt™ Accession No. O25533), and Bordetella pertussis (SwissProt™ Accession No. Q45338), respectively. The alignment was generated using ClustalW MSA software at the GenomeNet CLUSTALW Server at the Institute for Chemical Research, Kyoto University. The following parameters were used: Pairwise Alignment, K-tuple (word) size=1, Window size=5, Gap Penalty=3, Number of Top Diagonals=5, Scoring Method=Percent; Multiple Alignment, Gap Open Penalty=10, Gap Extension Penalty=0.0, Weight Transition=No, Hydrophilic residues=Gly, Pro, Ser, Asn, Asp, Gln, Glu, Arg and Lys, Hydrophobic Gaps=Yes; and Scoring Matrix=BLOSUM.

[0013] FIG. 7 is a schematic representation of the structure of pAN336, a plasmid designed to delete B. subtilis coaX from its chromosomal locus and replace it with a kanamycin resistance gene.

[0014] FIG. 8 is a schematic representation of the construction of pOTP72, a plasmid containing the H. pylori coaX gene.

[0015] FIG. 9 is a schematic representation of the construction of pOTP73, a plasmid containing the P. aeruginosa coaX gene.

[0016] FIG. 10 is a schematic representation of the construction of pOTP71, a plasmid containing the B. subtilis coaX gene.

DETAILED DESCRIPTION OF THE INVENTION

[0017] The present invention is based, at least in part, on the identification of a novel target for use in screening assays designed to identify antimicrobial agents. In particular, the present invention is based on the identification and characterization of a previously unidentified microbial pantothenate kinase. This pantothenate kinase, encoded by a gene termed coaX herein, is structurally unrelated to the pantothenate kinase encoded by the previously characterized E. coli gene, coaA; however, both genes encode functional pantothenate kinase enzymes, pantothenate kinase being essential for the synthesis of Coenzyme A (CoA). CoA is an essential coenzyme in all cells, participating in over 100 different intermediary reactions in cellular metabolism including, but not limited to, the tricarboxylic acid (TCA) cycle, fatty acid metabolism, vitamin biosynthesis and numerous other reactions of intermediary metabolism. Accordingly, pantothenate kinase production is essential for microbial growth. Coenzyme A (CoA) is synthesized in both eukaryotes and prokaryotes from pantothenate, also known as pantothenic acid or vitamin B5. The initial (and possibly rate-controlling) step in the conversion of pantothenate to Coenzyme A (CoA) is phosphorylation of pantothenate by pantothenate kinase. A schematic representation of the pathway leading to CoA biosynthesis in E. coli, i.e., the E. coli CoA biosynthetic pathway, is set forth as FIG. 1. The term “CoA biosynthetic pathway”, as used herein, includes the biosynthetic pathway involving CoA biosynthetic enzymes (e.g., polypeptides encoded by biosynthetic enzyme-encoding genes), compounds (e.g., precursors, substrates, intermediates or products), cofactors and the like utilized in the formation or synthesis of CoA from pantothenate. The CoA biosynthetic pathway depicted is also presumed to be that utilized by other microorganisms. The term “CoA biosynthetic pathway” includes the biosynthetic pathway leading to the synthesis of CoA in microorganisms (e.g., in vivo) as well as the biosynthetic pathway leading to the synthesis of CoA in vitro.

[0018] The term “Coenzyme A or CoA biosynthetic enzyme” includes any enzyme utilized in the formation of a compound (e.g., intermediate or product) of the CoA biosynthetic pathway, for example, the coaA, panK or coaX gene product which catalyzes the phosphorylation of pantothenate to form 4′-phosphopantothenate, or the coaD gene product which catalyzes the conversion of 4′-phosphopantetheine to dephosphocoenzyme A.

[0019] The coaX gene was first identified in B. subtilis, a microorganism in which it is one of two pantothenate kinase-encoding genes. Initially, the present inventors identified and cloned the B. subtilis coaA gene (previously termed yqjS) that encodes a pantothenate kinase homologous to the CoaA enzyme previously characterized in E. coli. A second gene (previously termed yacB) has also been identified and cloned by the present inventors that is not homologous to any previously described pantothenate kinase. This latter pantothenate kinase-encoding gene has been renamed coaX. The coaX gene could be deleted from B. subtilis strains with an intact coaA gene, but it could not be deleted from a strain containing a deletion in the coaA gene, indicating that the coaX gene is not essential in B. subtilis strains with a wild-type coaA gene.

[0020] Homologs of the coaX gene can be found in a number of bacterial species, including but not limited to Aquifex aeolicus, Bacillus anthracis, Bacillus halodurans, Bacillus stearothermophilus, Caulobacter crescentus, Chlorobium tepidum, Clostridium acetobutylicum, Dehalococcoides ethenogenes, Deinococcus radiodurans, Desulfovibrio vulgaris, Geobacter sulfurreducens, Pseudomonas putida, Rhodobacter capsulatus, Thiobacillus ferrooxidans, Streptomyces coelicolor, Synechocystis sp., Thermotoga maritima, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Porphyromonas gingivalis, Pseudomonas aeruginosa, Legionella pneumophila, Treponema pallidum, Xylella fastidiosa and Mycobacterium tuberculosis. More importantly, however, this novel pantothenate kinase gene has been found to be the sole essential pantothenate kinase in troublesome pathogens including, but not limited to, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Helicobacter pylori, Neisseria meningitidis, Pseudomonas aeruginosa, Treponema pallidum and Xylella fastidiosa. Accordingly, the coaX gene represents an attractive target for screening for new antibacterial compounds to combat these pathogenic microorganisms, particularly microorganisms in which coaX is the sole pantothenate kinase-encoding gene.

[0021] Accordingly, in one aspect the present invention features assays for the identification of an antibiotic that involve contacting a composition comprising a CoaX protein with a test compound; and determining the ability of the test compound to inhibit the activity of the CoaX protein; wherein the compound is identified as an antibiotic based on the ability of the compound to inhibit the activity of the CoaX protein. In another aspect, the invention features an assay for the identification of a potential antibiotic that involves contacting an assay composition comprising CoaX with a test compound; and determining the ability of the test compound to bind to the CoaX; wherein the compound is identified as a potential antibiotic based on the ability of the compound to bind to the CoaX. In a preferred assay format, the composition is also contacted with pantothenate or a pantothenate analog and activity determined.

[0022] In another aspect, the invention features methods for identifying pantothenate kinase modulators that involve contacting a recombinant cell expressing a single pantothenate kinase encoded by a coaX gene with a test compound and determining the ability of the test compound to modulate pantothenate kinase activity in said cell. In another aspect, the invention features methods for identifying pantothenate kinase modulators that involve contacting a recombinant cell expressing a first and second pantothenate kinase with a test compound and determining the ability of the test compound to modulate pantothenate kinase activity in said cell, wherein the first or second pantothenate kinase has reduced activity. Preferred recombinant microorganisms are of the genus Bacillus or Escherichia (e.g., Bacillus subtilis or Escherichia coli).

[0023] Also featured are isolated nucleic acid molecules that include a coaX gene of the present invention, isolated proteins encoded by the coaX genes of the present invention and biologically active portions thereof. In one embodiment, the invention features a coaX gene derived from a microorganism selected from the group consisting of Aquifex aeolicus, Bacillus anthracis, Bacillus halodurans, Bacillus stearothermophilus, Bacillus subtilis, Caulobacter crescentus, Chlorobium tepidum, Clostridium acetobutylicum, Dehalococcoides ethenogenes, Deinococcus radiodurans, Desulfovibrio vulgaris, Geobacter sulfurreducens, Pseudomonas putida, Rhodobacter capsulatus, Thiobacillus ferrooxidans, Streptomyces coelicolor, Synechocystis sp., Thermotoga maritima, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Porphyromonas gingivalis, Pseudomonas aeruginosa, Legionella pneumophila, Treponema pallidum, Xylella fastidiosa and Mycobacterium tuberculosis, or a protein encoded by said coaX gene.

[0024] In another embodiment, the invention features isolated nucleic acid molecules that include a coaX gene derived from a pathogenic bacterium selected from the group consisting of Bacillus anthracis, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Pseudomonas aeruginosa, Porphyromonas gingivalis, Legionella pneumophila, Treponema pallidum and Xylella fastidiosa, or a protein encoded by said coaX gene. In a preferred embodiment, the invention features isolated nucleic acid molecules that include a coaX gene derived from a pathogenic bacterium selected from the group consisting of Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Helicobacter pylori, Neisseria meningitidis, Pseudomonas aeruginosa, Treponema pallidum and Xylella fastidiosa, or a protein encoded by said coaX gene.

[0025] Also featured are recombinant vectors that include the isolated coaX genes of the present invention and recombinant microorganisms that include said vectors.

[0026] I. General Background

[0027] A pantothenate kinase activity was first identified in Salmonella typhimurium by screening for temperature-sensitive mutants which synthesized CoA at permissive temperatures but excreted pantothenate at non-permissive temperatures. The mutations were mapped in the Salmonella chromosome and the genetic locus was designated coaA. The gene encodes the enzyme that catalyzes the first step in the biosynthesis of coenzyme A from pantothenate (Dunn and Snell (1979) J. Bacteriol. 140:805-808). Escherichia coli temperature-sensitive mutants have also been isolated and characterized (Vallari and Rock (1987) J. Bacteriol. 169:5795-5800). These mutants (named coaA15(Ts)) are defective in the conversion of pantothenate to CoA and further exhibit a temperature-sensitive growth phenotype, indicating that pantothenate kinase activity is essential for growth. Moreover, it was noted that CoA inhibited pantothenate kinase activity to the same degree in the mutant as compared to the wild-type enzyme.

[0028] Feedback resistant E. coli mutants (named coaA16(Fr)) have also been isolated that possess a pantothenate kinase activity that is refractory to feedback inhibition by CoA (Vallari and Jackowski (1988) J. Bacteriol. 170:3961-3966). The mutation responsible for the reversion is, surprisingly, not genetically linked to the coaA gene by transduction. Additional data described therein support the view that the total cellular CoA content is controlled by both modulation of biosynthesis at the pantothenate kinase step and possibly by degradation of CoA to 4′-phosphopantetheine.

[0029] The wild-type E. coli coaA gene was cloned by functional complementation of E. coli temperature-sensitive mutants. The sequence of the wild-type gene was determined (Song and Jackowski (1992) J. Bacteriol. 174:6411-6417 and Flamm et al. (1988) Gene (Amst.) 74:555-558). Strains containing multiple copies of the coaA gene possessed 76-fold higher specific activity of pantothenate kinase; however, there was only a 2.7-fold increase in the steady state level of CoA (Song and Jackowski, supra). It has further been reported that the prokaryotic enzyme (encoded by coaA in E. coli and a variety of other microorganisms) is feedback inhibited by CoA both in vivo and in vitro, with CoA being about five times more potent than acetyl-CoA in inhibiting the enzyme (Song and Jackowski, supra and Vallari et al., supra). These data further support the view that feedback inhibition of pantothenate kinase activity is a critical factor controlling intracellular CoA concentration. The E. coli CoaA protein has been crystallized and the structure solved (Yun et al. (2000) J. Biol. Chem. 275(36):28093-28099).

[0030] Using standard search and alignment tools, coaA homologues have been identified in Haemophilus influenzae, Mycobacterium tuberculosis, Vibrio cholerae, Streptococcus pyogenes and Bacillus subtilis. By contrast, proteins with significant similarity could not be identified in eukaryotic cells including Saccharomyces cerevisiae or in mammalian expressed sequence tag (EST) databases. Using a genetic selection strategy, a cDNA encoding pantothenate kinase activity has recently been identified from Aspergillus nidulans (Calder et al. (1999) J. Biol. Chem. 274:2014-2020). The eukaryotic pantothenate kinase gene (panK) has distinct primary structure and unique regulatory properties that clearly distinguish it from its prokaryotic counterpart. A mammalian pantothenate kinase gene (panK1α) has also been isolated which encodes a protein having homology to the A. nidulans PanK protein and to the predicted gene product of GenBank™ Accession Number 927798 identified in the S. cerevisiae genome (Rock et al. (2000) J. Biol. Chem. 275:1377-1383).

[0031] II. coaX Nucleic Acid Molecules

[0032] The present invention relates, at least in part, to the identification of a novel microbial pantothenate kinase encoding gene, coaX, that is structurally distinct from a previously identified microbial pantothenate kinase encoding gene, coaA. Accordingly, one aspect of the present invention features isolated coaX nucleic acid molecules and/or genes useful, for example, for encoding pantothenate kinase enzymes for use in screening assays.

[0033] The term “nucleic acid molecule” includes DNA molecules (e.g., linear, circular, cDNA or chromosomal DNA) and RNA molecules (e.g., tRNA, rRNA, mRNA) and analogs of the DNA or RNA generated using nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA. The term “isolated” nucleic acid molecule includes a nucleic acid molecule that is free of sequences that naturally flank the nucleic acid molecule (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid molecule) in the chromosomal DNA of the organism from which the nucleic acid is derived. In various embodiments, an isolated nucleic acid molecule can contain less than about 10 kb, 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, 0.1 kb, 50 bp, 25 bp or 10 bp of nucleotide sequences which naturally flank the nucleic acid molecule in chromosomal DNA of the microorganism from which the nucleic acid molecule is derived. Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular materials when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized.

[0034] The term “gene”, as used herein, includes a nucleic acid molecule (e.g., a DNA molecule or segment thereof), for example, a protein or RNA-encoding nucleic acid molecule, that, in an organism, is separated from another gene or other genes by intergenic DNA (i.e., intervening or spacer DNA which naturally flanks the gene and/or separates genes in the chromosomal DNA of the organism). A gene may direct synthesis of an enzyme or other protein molecule (e.g., may comprise coding sequences, for example, a contiguous open reading frame (ORF) which encodes a protein) or may itself be functional in the organism. A gene in an organism may be clustered in an operon, as defined herein, said operon being separated from other genes and/or operons by the intergenic DNA. Individual genes contained within an operon may overlap without intergenic DNA between said individual genes. An “isolated gene”, as used herein, includes a gene which is essentially free of sequences which naturally flank the gene in the chromosomal DNA of the organism from which the gene is derived (i.e., is free of adjacent coding sequences which encode a second or distinct protein or RNA molecule, adjacent structural sequences or the like) and optionally includes 5′ and 3′ regulatory sequences, for example promoter sequences and/or terminator sequences. In one embodiment, an isolated gene includes predominantly coding sequences for a protein (e.g., sequences which encode Bacillus proteins). In another embodiment, an isolated gene includes coding sequences for a protein (e.g., for a Bacillus protein) and adjacent 5′ and/or 3′ regulatory sequences from the chromosomal DNA of the organism from which the gene is derived (e.g., adjacent 5′ and/or 3′ Bacillus regulatory sequences). Preferably, an isolated gene contains less than about 10 kb, 5 kb, 2 kb, 1 kb, 0.5 kb, 0.2 kb, 0.1 kb, 50 bp, 25 bp or 10 bp of nucleotide sequences which naturally flank the gene in the chromosomal DNA of the organism from which the gene is derived.

[0035] In one embodiment, an isolated nucleic acid molecule is or includes a coaX gene. In another embodiment, an isolated nucleic acid molecule is or includes a portion or fragment of a coaX gene. In one embodiment, an isolated coaX nucleic acid molecule is derived from a microorganism selected from the group consisting of Aquifex aeolicus, Bacillus anthracis, Bacillus halodurans, Bacillus stearothermophilus, Bacillus subtilis, Caulobacter crescentus, Chlorobium tepidum, Clostridium acetobutylicum, Dehalococcoides ethenogenes, Deinococcus radiodurans, Desulfovibrio vulgaris, Geobacter sulfurreducens, Pseudomonas putida, Rhodobacter capsulatus, Thiobacillus ferrooxidans, Streptomyces coelicolor, Synechocystis sp., Thermotoga maritima, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Porphyromonas gingivalis, Pseudomonas aeruginosa, Pseudomonas syringae pv tomato, Treponema pallidum, Xylella fastidiosa, Legionella pneumophila and Mycobacterium tuberculosis. In another embodiment, an isolated coaX nucleic acid molecule is derived from a microorganism selected from the group consisting of Bacillus anthracis, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Porphyromonas gingivalis, Pseudomonas aeruginosa, Treponema pallidum, Xylella fastidiosa and Legionella pneumophila. In another embodiment, an isolated coaX nucleic acid molecule is derived from a microorganism selected from the group consisting of Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Pseudomonas aeruginosa, Treponema pallidum and Xylella fastidiosa. In another embodiment, an isolated coaX nucleic acid molecule or gene comprises a nucleotide sequence set forth as any one of SEQ ID NO:32, SEQ ID NO:69, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:1, SEQ ID NO:38, SEQ ID NO:31, SEQ ID NO:36, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:23, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:28, SEQ ID NO:60, SEQ ID NO:27, SEQ ID NO:34 or SEQ ID NO:68, SEQ ID NO:25, SEQ ID NO:40, SEQ ID NO:44, SEQ ID NO:42, SEQ ID NO:35 or SEQ ID NO:37, SEQ ID NO:62, SEQ ID NO:26, SEQ ID NO:24, SEQ ID NO:33, SEQ ID NO:29, SEQ ID NO:64, SEQ ID NO:30 and SEQ ID NO:66. In another embodiment, an isolated nucleic acid molecule of the present invention comprises a nucleotide sequence which is at least about 50-55%, preferably at least about 60-65%, more preferably at least about 70-75%, more preferably at least about 80-85%, and even more preferably at least about 90-95% or more identical to a nucleotide sequence set forth as any one of SEQ ID NO:32, SEQ ID NO:69, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:1, SEQ ID NO:38, SEQ ID NO:31, SEQ ID NO:36, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:23, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:28, SEQ ID NO:60, SEQ ID NO:27, SEQ ID NO:34 or SEQ ID NO:68, SEQ ID NO:25, SEQ ID NO:40, SEQ ID NO:44, SEQ ID NO:42, SEQ ID NO:35 or SEQ ID NO:37, SEQ ID NO:62, SEQ ID NO:26, SEQ ID NO:24, SEQ ID NO:33, SEQ ID NO:29, SEQ ID NO:64, SEQ ID NO:30 and SEQ ID NO:66.

[0036] In yet another embodiment, an isolated coaX nucleic acid molecule or gene comprises a nucleotide sequence that encodes a protein having an amino acid sequence as set forth in any one of SEQ ID NO:12, SEQ ID NO:70, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:2, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:3, SEQ ID NO:57, SEQ ID NO:8, SEQ ID NO:59, SEQ ID NO:7, SEQ ID NO:61, SEQ ID NO:6, SEQ ID NO:63, SEQ ID NO:4, SEQ ID NO:13, SEQ ID NO:9, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10, SEQ ID NO:65 and SEQ ID NO:5. In yet another embodiment, an isolated coaX nucleic acid molecule or gene encodes a homologue of the CoaX proteins having the amino acid sequences of SEQ ID NO:12, SEQ ID NO:70, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:2, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:3, SEQ ID NO:57, SEQ ID NO:8, SEQ ID NO:59, SEQ ID NO:7, SEQ ID NO:61, SEQ ID NO:6, SEQ ID NO:63, SEQ ID NO:4, SEQ ID NO:13, SEQ ID NO:9, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10, SEQ ID NO:65 and SEQ ID NO:5. As used herein, the term “homologue” includes a protein or polypeptide sharing at least about 30-35%, preferably at least about 35-40%, more preferably at least about 40-50%, and even more preferably at least about 60%, 70%, 80%, 90% or more identity with the amino acid sequence of a wild-type protein or polypeptide described herein and having a substantially equivalent functional or biological activity as said wild-type protein or polypeptide. For example, a CoaX homologue shares at least about 30-35%, preferably at least about 35-40%, more preferably at least about 40-50%, and even more preferably at least about 60%, 70%, 80%, 90% or more identity with any one of the proteins having the amino acid sequences set forth as SEQ ID NO:12, SEQ ID NO:70, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:2, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:3, SEQ ID NO:57, SEQ ID NO:8, SEQ ID NO:59, SEQ ID NO:7, SEQ ID NO:61, SEQ ID NO:6, SEQ ID NO:63, SEQ ID NO:4, SEQ ID NO:13, SEQ ID NO:9, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10, SEQ ID NO:65 and SEQ ID NO:5 and has a substantially equivalent functional or biological activity (i.e., is a functional equivalent) of the proteins having the amino acid sequences set forth as SEQ ID NO:12, SEQ ID NO:70, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:2, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:3, SEQ ID NO:57, SEQ ID NO:8, SEQ ID NO:59, SEQ ID NO:7, SEQ ID NO:61, SEQ ID NO:6, SEQ ID NO:63, SEQ ID NO:4, SEQ ID NO:13, SEQ ID NO:9, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10, SEQ ID NO:65 and SEQ ID NO:5 (e.g., has a substantially equivalent CoaX activity). In a preferred embodiment, an isolated coaX nucleic acid molecule or gene comprises a nucleotide sequence that encodes a polypeptide as set forth in any one of SEQ ID NO:12, SEQ ID NO:70, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:2, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:3, SEQ ID NO:57, SEQ ID NO:8, SEQ ID NO:59, SEQ ID NO:7, SEQ ID NO:61, SEQ ID NO:6, SEQ ID NO:63, SEQ ID NO:4, SEQ ID NO:13, SEQ ID NO:9, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10, SEQ ID NO:65 and SEQ ID NO:5.
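The identity thresholds recited for a “homologue” can be illustrated with a short calculation. The following is a minimal sketch only, assuming two amino acid sequences that have already been aligned to equal length with "-" as the gap character. The function names and the choice of denominator (all alignment columns that are not gaps in both sequences) are assumptions made for illustration, since the text does not specify how percent identity is computed. Sequence identity alone does not establish the substantially equivalent functional activity that the definition also requires.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: the denominator is the number of alignment columns in which
    at least one sequence has a residue; other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Drop columns where both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             minimum_percent: float = 30.0) -> bool:
    """True if identity reaches the given threshold (default: the lowest recited, 30%)."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent


if __name__ == "__main__":
    # Toy aligned fragments, not real CoaX sequences.
    print(percent_identity("MKL-V", "MKIAV"))
    print(meets_identity_threshold("MKL-V", "MKIAV", minimum_percent=40.0))
```

A candidate passing such a check would still need to be tested for pantothenate kinase activity before being treated as a CoaX homologue in the sense used here.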

[0037] In another embodiment, an isolated coaX nucleic acid molecule hybridizes to all or a portion of a nucleic acid molecule having the nucleotide sequence set forth in any one of SEQ ID NO:32, SEQ ID NO:69, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:1, SEQ ID NO:38, SEQ ID NO:31, SEQ ID NO:36, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:23, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:28, SEQ ID NO:60, SEQ ID NO:27, SEQ ID NO:34 or SEQ ID NO:68, SEQ ID NO:25, SEQ ID NO:40, SEQ ID NO:44, SEQ ID NO:42, SEQ ID NO:35 or SEQ ID NO:37, SEQ ID NO:62, SEQ ID NO:26, SEQ ID NO:24, SEQ ID NO:33, SEQ ID NO:29, SEQ ID NO:64, SEQ ID NO:30 and SEQ ID NO:66 or hybridizes to all or a portion of a nucleic acid molecule having a nucleotide sequence that encodes a polypeptide having the amino acid sequence of any of SEQ ID NO:12, SEQ ID NO:70, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:2, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:3, SEQ ID NO:57, SEQ ID NO:8, SEQ ID NO:59, SEQ ID NO:7, SEQ ID NO:61, SEQ ID NO:6, SEQ ID NO:63, SEQ ID NO:4, SEQ ID NO:13, SEQ ID NO:9, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10, SEQ ID NO:65 and SEQ ID NO:5. Such hybridization conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, Ausubel et al., eds., John Wiley & Sons, Inc. (1995), sections 2, 4 and 6. Additional stringent conditions can be found in Molecular Cloning: A Laboratory Manual, Sambrook et al., Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (1989), chapters 7, 9 and 11. A preferred, non-limiting example of stringent hybridization conditions includes hybridization in 4×sodium chloride/sodium citrate (SSC), at about 65-70° C. (or hybridization in 4×SSC plus 50% formamide at about 42-50° C.) followed by one or more washes in 1×SSC, at about 65-70° C.
A preferred, non-limiting example of highly stringent hybridization conditions includes hybridization in 1×SSC, at about 65-70° C. (or hybridization in 1×SSC plus 50% formamide at about 42-50° C.) followed by one or more washes in 0.3×SSC, at about 65-70° C. A preferred, non-limiting example of reduced stringency hybridization conditions includes hybridization in 4×SSC, at about 50-60° C. (or alternatively hybridization in 6×SSC plus 50% formamide at about 40-45° C.) followed by one or more washes in 2×SSC, at about 50-60° C. Ranges intermediate to the above-recited values, e.g., at 65-70° C. or at 42-50° C., are also intended to be encompassed by the present invention. SSPE (1×SSPE is 0.15 M NaCl, 10 mM NaH₂PO₄, and 1.25 mM EDTA, pH 7.4) can be substituted for SSC (1×SSC is 0.15 M NaCl and 15 mM sodium citrate) in the hybridization and wash buffers; washes are performed for 15 minutes each after hybridization is complete. The hybridization temperature for hybrids anticipated to be less than 50 base pairs in length should be 5-10° C. less than the melting temperature (T_m) of the hybrid, where T_m is determined according to the following equations. For hybrids less than 18 base pairs in length, T_m (° C.) = 2(# of A+T bases) + 4(# of G+C bases). For hybrids between 18 and 49 base pairs in length, T_m (° C.) = 81.5 + 16.6(log₁₀[Na⁺]) + 0.41(% G+C) − (600/N), where N is the number of bases in the hybrid, and [Na⁺] is the concentration of sodium ions in the hybridization buffer ([Na⁺] for 1×SSC = 0.165 M).
It will also be recognized by the skilled practitioner that additional reagents may be added to hybridization and/or wash buffers to decrease non-specific hybridization of nucleic acid molecules to membranes, for example, nitrocellulose or nylon membranes, including but not limited to blocking agents (e.g., BSA or salmon or herring sperm carrier DNA), detergents (e.g., SDS), chelating agents (e.g., EDTA), Ficoll, PVP and the like. When using nylon membranes, in particular, an additional preferred, non-limiting example of stringent hybridization conditions is hybridization in 0.25-0.5 M NaH₂PO₄, 7% SDS at about 65° C., followed by one or more washes at 0.02 M NaH₂PO₄, 1% SDS at 65° C., see e.g., Church and Gilbert (1984) Proc. Natl. Acad. Sci. USA 81:1991-1995, (or, alternatively, 0.2×SSC, 1% SDS). In another preferred embodiment, an isolated nucleic acid molecule comprises a nucleotide sequence that is complementary to a coaX nucleotide sequence as set forth herein (e.g., is the full complement of the nucleotide sequence set forth as SEQ ID NO:19). Preferably, an isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to the sequence of SEQ ID NO:32, SEQ ID NO:69, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:1, SEQ ID NO:38, SEQ ID NO:31, SEQ ID NO:36, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:23, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:28, SEQ ID NO:60, SEQ ID NO:27, SEQ ID NO:34 or SEQ ID NO:68, SEQ ID NO:25, SEQ ID NO:40, SEQ ID NO:44, SEQ ID NO:42, SEQ ID NO:35 or SEQ ID NO:37, SEQ ID NO:62, SEQ ID NO:26, SEQ ID NO:24, SEQ ID NO:33, SEQ ID NO:29, SEQ ID NO:64, SEQ ID NO:30 and SEQ ID NO:66, or to a complement thereof, corresponds to a naturally-occurring nucleic acid molecule. As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature.
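
By way of illustration only, the two melting-temperature equations recited in this paragraph for hybrids shorter than 50 base pairs can be sketched in Python as follows. The function names, the default sodium concentration argument (taken from the 1×SSC value of 0.165 M recited above) and the input validation are assumptions made for this sketch and form no part of the disclosed methods.

```python
import math


def melting_temperature(sequence: str, sodium_molar: float = 0.165) -> float:
    """Estimate T_m (deg C) of a short hybrid from the two recited equations.

    Hypothetical helper: the name and signature are illustrative only.
    """
    seq = sequence.upper()
    n = len(seq)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if n == 0 or at + gc != n:
        raise ValueError("sequence must contain only A, C, G and T")
    if n < 18:
        # Hybrids shorter than 18 bp: 2 deg per A/T base, 4 deg per G/C base.
        return 2.0 * at + 4.0 * gc
    if n <= 49:
        # Hybrids of 18-49 bp: salt- and GC-content-adjusted equation.
        percent_gc = 100.0 * gc / n
        return (81.5 + 16.6 * math.log10(sodium_molar)
                + 0.41 * percent_gc - 600.0 / n)
    raise ValueError("the recited equations apply only to hybrids under 50 bp")


def hybridization_temperature_range(tm: float) -> tuple[float, float]:
    """Return the (low, high) window lying 5-10 deg C below T_m."""
    return (tm - 10.0, tm - 5.0)


if __name__ == "__main__":
    tm = melting_temperature("ATGCATGCATGC")  # 12-mer, 6 A/T and 6 G/C bases
    print(tm, hybridization_temperature_range(tm))
```

Sequences of 50 bases or more are rejected rather than extrapolated, because the paragraph gives equations only for the shorter hybrids.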

[0038] A nucleic acid molecule of the present invention (e.g., a coaX nucleic acid molecule or gene) can be isolated using standard molecular biology techniques and the sequence information provided herein. For example, nucleic acid molecules can be isolated using standard hybridization and cloning techniques (e.g., as described in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989) or can be isolated by the polymerase chain reaction using synthetic oligonucleotide primers designed based upon the coaX nucleotide sequences set forth herein, or flanking sequences thereof. A nucleic acid of the invention (e.g., a coaX nucleic acid molecule or gene) can be amplified using cDNA, mRNA or, alternatively, chromosomal DNA as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques. Assays for identifying coaX genes of the present invention or homologues thereof can be accomplished, for example, by expressing the coaX gene in a microorganism, for example, a microorganism which expresses pantothenate kinase in a temperature-sensitive manner, and assaying the gene for the ability to complement a temperature-sensitive (Ts) mutant for pantothenate kinase activity. A coaX gene that encodes a functional pantothenate kinase is one that complements the Ts mutant.

[0039] Yet another embodiment of the present invention features mutant coaX and coaA nucleic acid molecules or genes. The phrase “mutant nucleic acid molecule” or “mutant gene” as used herein, includes a nucleic acid molecule or gene having a nucleotide sequence which includes at least one alteration (e.g., substitution, insertion, deletion) such that the polypeptide or protein that may be encoded by said mutant exhibits an activity that differs from the polypeptide or protein encoded by the wild-type nucleic acid molecule or gene. Preferably, a mutant nucleic acid molecule or mutant gene (e.g., a mutant coaA or coaX gene) encodes a polypeptide or protein having a reduced activity (e.g., having a reduced pantothenate kinase activity) as compared to the polypeptide or protein encoded by the wild-type nucleic acid molecule or gene, for example, when assayed under similar conditions (e.g., assayed in microorganisms cultured at the same temperature). A mutant gene also can encode no polypeptide or have a reduced level of production of the wild-type polypeptide.

[0040] As used herein, a “reduced activity” or “reduced enzymatic activity” is one that is at least 5% less than that of the polypeptide or protein encoded by the wild-type nucleic acid molecule or gene, preferably at least 5-10% less, more preferably at least 10-25% less and even more preferably at least 25-50%, 50-75% or 75-100% less than that of the polypeptide or protein encoded by the wild-type nucleic acid molecule or gene. Ranges intermediate to the above-recited values, e.g., 75-85%, 85-90%, 90-95%, are also intended to be encompassed by the present invention. As used herein, a “reduced activity” or “reduced enzymatic activity” also includes an activity that has been deleted or “knocked out” (e.g., approximately 100% less activity than that of the polypeptide or protein encoded by the wild-type nucleic acid molecule or gene). Activity can be determined according to any well accepted assay for measuring activity of a particular protein of interest. Activity can be measured or assayed directly, for example, measuring an activity of a protein isolated or purified from a cell. Alternatively, an activity can be measured or assayed within a cell or in an extracellular medium or in a crude extract of cells.
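
As a non-limiting illustration of the definition above, the percent reduction of a mutant activity relative to the wild-type activity, and the comparison against the 5% threshold, can be sketched in Python. The function names and the assumption that both activities are measured in the same units under the same assay conditions are illustrative only.

```python
def percent_reduction(mutant_activity: float, wild_type_activity: float) -> float:
    """Percent by which the mutant activity falls below the wild-type activity.

    Hypothetical helper; both values are assumed to share the same units.
    """
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    return 100.0 * (wild_type_activity - mutant_activity) / wild_type_activity


def is_reduced_activity(mutant_activity: float,
                        wild_type_activity: float,
                        threshold_percent: float = 5.0) -> bool:
    """True when the reduction meets the recited minimum of 5% (by default)."""
    return percent_reduction(mutant_activity, wild_type_activity) >= threshold_percent


if __name__ == "__main__":
    # A mutant retaining 40 of 100 activity units shows a 60% reduction.
    print(percent_reduction(40.0, 100.0), is_reduced_activity(40.0, 100.0))
```

A knocked-out activity (mutant activity of zero) yields a 100% reduction under this sketch, consistent with the “knocked out” case recited above.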

[0041] It will be appreciated by the skilled artisan that even a single substitution in a nucleic acid or gene sequence (e.g., a base substitution that encodes an amino acid change in the corresponding amino acid sequence) can dramatically affect the activity of an encoded polypeptide or protein as compared to the corresponding wild-type polypeptide or protein. A mutant nucleic acid or mutant gene (e.g., encoding a mutant polypeptide or protein), as defined herein, is readily distinguishable from a nucleic acid or gene encoding a protein homologue, as described above, in that a mutant nucleic acid or mutant gene encodes a protein or polypeptide having an altered activity, optionally observable as a different or distinct phenotype in a microorganism expressing said mutant gene or nucleic acid or producing said mutant protein or polypeptide (i.e., a mutant microorganism) as compared to a corresponding microorganism expressing the wild-type gene or nucleic acid or producing the wild-type protein or polypeptide. By contrast, a protein homologue has an identical or substantially similar activity, optionally phenotypically indiscernible when produced in a microorganism, as compared to a corresponding microorganism expressing the wild-type gene or nucleic acid.
Accordingly, it is not, for example, the degree of sequence identity between nucleic acid molecules, genes, proteins or polypeptides that serves to distinguish between homologues and mutants; rather, it is the activity of the encoded protein or polypeptide that distinguishes between homologues and mutants: homologues having, for example, low sequence identity (e.g., 30-50% sequence identity) yet having substantially equivalent functional activities, and mutants, for example, sharing 99% sequence identity yet having dramatically different or altered functional activities. Exemplary homologues are set forth as SEQ ID NO:12, SEQ ID NO:70, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:3, SEQ ID NO:57, SEQ ID NO:8, SEQ ID NO:59, SEQ ID NO:7, SEQ ID NO:61, SEQ ID NO:6, SEQ ID NO:63, SEQ ID NO:4, SEQ ID NO:13, SEQ ID NO:9, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10, SEQ ID NO:65 and SEQ ID NO:5 (i.e., CoaX homologues). Exemplary mutants are described in Examples III-IV herein.

[0042] III. CoaX Proteins

[0043] Another aspect of the present invention features isolated proteins (e.g., isolated CoaX proteins encoded, for example, by any one of the coaX genes or nucleic acids described herein). In one embodiment, the isolated proteins are produced by recombinant DNA techniques and can be isolated from microorganisms expressing, for example, any one of the coaX genes or nucleic acids described herein, by an appropriate purification scheme using standard protein purification techniques. In another embodiment, proteins are synthesized chemically using standard peptide synthesis techniques.

[0044] An “isolated” or “purified” protein (e.g., an isolated or purified CoaX enzyme) is substantially free of cellular material or other contaminating proteins from the microorganism from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. In one embodiment, an isolated or purified protein has less than about 30% (by dry weight) of contaminating protein or chemicals, more preferably less than about 20% of contaminating protein or chemicals, still more preferably less than about 10% of contaminating protein or chemicals, and most preferably less than about 5% of contaminating protein or chemicals.

[0045] A “partially purified” protein (e.g., a partially purified CoaX enzyme) is a composition comprising a protein of interest where the composition has been subjected to at least one purification step, separation step, concentration step, or the like, such that the protein of interest is present at a greater concentration or level than prior to the purification step, separation step, concentration step, or the like. In one embodiment, a partially purified protein has between about 50-65% (by dry weight) of contaminating protein or chemicals, preferably between about 40-50% of contaminating protein or chemicals, more preferably between about 30-40% of contaminating protein or chemicals.

[0046] Included within the scope of the present invention are CoaX proteins encoded by naturally-occurring bacterial or microbial genes, for example, by coaX genes derived from a microorganism selected from the group consisting of Aquifex aeolicus, Bacillus anthracis, Bacillus halodurans, Bacillus stearothermophilus, Bacillus subtilis, Caulobacter crescentus, Chlorobium tepidum, Clostridium acetobutylicum, Dehalococcoides ethenogenes, Deinococcus radiodurans, Desulfovibrio vulgaris, Geobacter sulfurreducens, Pseudomonas putida, Rhodobacter capsulatus, Thiobacillus ferrooxidans, Streptomyces coelicolor, Synechocystis sp., Thermotoga maritima, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Porphyromonas gingivalis, Pseudomonas aeruginosa, Treponema pallidum, Xylella fastidiosa and Mycobacterium tuberculosis. Further included within the scope of the present invention are CoaX proteins that are encoded by bacterial or microbial genes which differ from the naturally-occurring bacterial or microbial genes described herein, for example, genes which have nucleic acids that are mutated, inserted or deleted, but which encode proteins substantially similar to the naturally-occurring gene products of the present invention. For example, it is well understood that one of skill in the art can mutate (e.g., substitute) nucleic acids which, due to the degeneracy of the genetic code, encode an amino acid identical to that encoded by the naturally-occurring gene. Moreover, it is well understood that one of skill in the art can mutate (e.g., substitute) nucleic acids which encode conservative amino acid substitutions. 
It is further well understood that one of skill in the art can substitute, add or delete amino acids to a certain degree without substantially affecting the function of a gene product as compared with a naturally-occurring gene product, each instance of which is intended to be included within the scope of the present invention.

[0047] In one embodiment, an isolated protein of the present invention is encoded by a coaX gene derived from a microorganism selected from the group consisting of Aquifex aeolicus, Bacillus anthracis, Bacillus halodurans, Bacillus stearothermophilus, Bacillus subtilis, Caulobacter crescentus, Chlorobium tepidum, Clostridium acetobutylicum, Dehalococcoides ethenogenes, Deinococcus radiodurans, Desulfovibrio vulgaris, Geobacter sulfurreducens, Pseudomonas putida, Rhodobacter capsulatus, Thiobacillus ferrooxidans, Streptomyces coelicolor, Synechocystis sp., Thermotoga maritima, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Porphyromonas gingivalis, Pseudomonas aeruginosa, Treponema pallidum, Xylella fastidiosa and Mycobacterium tuberculosis. In another embodiment, an isolated protein of the present invention is encoded by a coaX gene derived from a microorganism selected from the group consisting of Bacillus anthracis, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Porphyromonas gingivalis, Pseudomonas aeruginosa, Legionella pneumophila, Treponema pallidum and Xylella fastidiosa (e.g., is encoded by a coaX gene derived from a pathogenic bacterium). In yet another embodiment, an isolated protein of the present invention is encoded by a coaX gene derived from a microorganism selected from the group consisting of Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Pseudomonas aeruginosa, Treponema pallidum and Xylella fastidiosa (e.g., is encoded by a coaX gene derived from a pathogenic bacterium which has coaX as its sole pantothenate kinase-encoding gene).
In a preferred embodiment, an isolated protein of the present invention (e.g., a CoaX) has an amino acid sequence as set forth in any one of SEQ ID NO:12, SEQ ID NO:70, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:2, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:3, SEQ ID NO:57, SEQ ID NO:8, SEQ ID NO:59, SEQ ID NO:7, SEQ ID NO:61, SEQ ID NO:6, SEQ ID NO:63, SEQ ID NO:4, SEQ ID NO:13, SEQ ID NO:9, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10, SEQ ID NO:65 and SEQ ID NO:5. In other embodiments, an isolated protein of the present invention (e.g., a CoaX) is a homologue of at least one of the proteins set forth as SEQ ID NO:12, SEQ ID NO:70, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:2, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:3, SEQ ID NO:57, SEQ ID NO:8, SEQ ID NO:59, SEQ ID NO:7, SEQ ID NO:61, SEQ ID NO:6, SEQ ID NO:63, SEQ ID NO:4, SEQ ID NO:13, SEQ ID NO:9, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10, SEQ ID NO:65 and SEQ ID NO:5 (e.g., comprises an amino acid sequence at least about 30-40% identical, preferably about 40-50% identical, more preferably about 50-60% identical, and even more preferably about 60-70%, 70-80%, 80-90%, 90-95% or more identical to the amino acid sequence of SEQ ID NO:12, SEQ ID NO:70, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:2, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:3, SEQ ID NO:57, SEQ ID NO:8, SEQ ID NO:59, SEQ ID NO:7, SEQ ID NO:61, SEQ ID NO:6, SEQ ID NO:63, SEQ ID NO:4, SEQ ID NO:13, SEQ ID NO:9, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10, SEQ ID NO:65 and SEQ ID NO:5, and has an activity that is substantially similar to that of the protein having the amino acid sequence of SEQ ID NO:12, SEQ ID NO:70, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:2, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:3, SEQ ID NO:57, SEQ ID NO:8, SEQ ID NO:59, SEQ ID NO:7, SEQ ID NO:61, SEQ ID NO:6, SEQ ID NO:63, SEQ ID NO:4, SEQ ID NO:13, SEQ ID NO:9, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10, SEQ ID NO:65 and SEQ ID NO:5, respectively).

[0048] To determine the percent homology of two amino acid sequences or of two nucleic acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino acid or nucleic acid sequence). When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = # of identical positions/total # of positions × 100), preferably taking into account the number of gaps and size of said gaps necessary to produce an optimal alignment.
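
Purely as an illustrative sketch, the percent identity relationship recited above can be computed in Python for two sequences that have already been aligned. The function name, the use of “-” as the gap character, and the choice to count every alignment column (including gapped columns) in the total number of positions are assumptions of this sketch; the alignment step itself is not shown.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Hypothetical helper: gapped columns count toward the total number of
    positions but never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    total = len(aligned_a)
    if total == 0:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / total


if __name__ == "__main__":
    # Four identical columns out of six, one of which contains a gap.
    print(round(percent_identity("MKL-AV", "MKLTAI"), 2))
```

Counting gapped columns in the denominator is only one way of taking gaps into account; other conventions (for example, excluding gapped columns) give different values for the same alignment.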

[0049] The comparison of sequences and determination of percent homology between two sequences can be accomplished using a mathematical algorithm. A preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264-68, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-77. Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Research 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See http://www.ncbi.nlm.nih.gov. Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller (1988) Comput Appl Biosci. 4:11-17. Such an algorithm is incorporated into the ALIGN program available, for example, at the GENESTREAM network server, IGH Montpellier, FRANCE (http://vega.igh.cnrs.fr) or at the ISREC server (http://www.ch.embnet.org). When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used.

[0050] In another preferred embodiment, the percent homology between two amino acid sequences can be determined using the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 12, 10, 8, 6, or 4 and a length weight of 2, 3, or 4. In yet another preferred embodiment, the percent homology between two nucleic acid sequences can be determined using the GAP program in the GCG software package (available at http://www.gcg.com), using a gap weight of 50 and a length weight of 3.

[0051] VI. Recombinant Nucleic Acid Molecules, Vectors and Microorganisms

[0052] The present invention further features recombinant nucleic acid molecules (e.g., recombinant DNA molecules) that include nucleic acid molecules and/or genes described herein (e.g., isolated nucleic acid molecules and/or genes), preferably pantothenate kinase-encoding genes (e.g., coaX genes). The present invention further features vectors (e.g., recombinant vectors) that include nucleic acid molecules (e.g., isolated or recombinant nucleic acid molecules and/or genes) described herein. In particular, recombinant vectors are featured that include nucleic acid sequences that encode bacterial gene products as described herein, preferably bacterial nucleic acid sequences that encode bacterial pantothenate kinase proteins.

[0053] The term “recombinant nucleic acid molecule” includes a nucleic acid molecule (e.g., a DNA molecule) that has been altered, modified or engineered such that it differs in nucleotide sequence from the native or natural nucleic acid molecule from which the recombinant nucleic acid molecule was derived (e.g., by addition, deletion or substitution of one or more nucleotides). Preferably, a recombinant nucleic acid molecule (e.g., a recombinant DNA molecule) includes an isolated nucleic acid molecule or gene of the present invention (e.g., an isolated coaX gene) operably linked to regulatory sequences.

[0054] The term “recombinant vector” includes a vector (e.g., plasmid, phage, phasmid, virus, cosmid or other purified nucleic acid vector) that has been altered, modified or engineered such that it contains greater, fewer or different nucleic acid sequences than those included in the native or natural nucleic acid molecule from which the recombinant vector was derived. Preferably, the recombinant vector includes a coaX gene or recombinant nucleic acid molecule including such coaX gene, operably linked to regulatory sequences, for example, promoter sequences, terminator sequences and/or artificial ribosome binding sites (RBSs), as defined herein.

[0055] The phrase “operably linked to regulatory sequence(s)” means that the nucleotide sequence of the nucleic acid molecule or gene of interest is linked to the regulatory sequence(s) in a manner which allows for expression (e.g., enhanced, increased, constitutive, basal, attenuated, decreased or repressed expression) of the nucleotide sequence, preferably expression of a gene product encoded by the nucleotide sequence (e.g., when the recombinant nucleic acid molecule is included in a recombinant vector, as defined herein, and is introduced into a microorganism).

[0056] The term “regulatory sequence” includes nucleic acid sequences which affect (e.g., modulate or regulate) expression of other nucleic acid sequences. In one embodiment, a regulatory sequence is included in a recombinant nucleic acid molecule or recombinant vector in a similar or identical position and/or orientation relative to a particular gene of interest as is observed for the regulatory sequence and gene of interest as it appears in nature, e.g., in a native position and/or orientation. For example, a gene of interest can be included in a recombinant nucleic acid molecule or recombinant vector operably linked to a regulatory sequence which accompanies or is adjacent to the gene of interest in the natural organism (e.g., operably linked to “native” regulatory sequences, for example, to the “native” promoter). Alternatively, a gene of interest can be included in a recombinant nucleic acid molecule or recombinant vector operably linked to a regulatory sequence which accompanies or is adjacent to another (e.g., a different) gene in the natural organism. Alternatively, a gene of interest can be included in a recombinant nucleic acid molecule or recombinant vector operably linked to a regulatory sequence from another organism. For example, regulatory sequences from other microbes (e.g., other bacterial regulatory sequences, bacteriophage regulatory sequences and the like) can be operably linked to a particular gene of interest.

[0057] In one embodiment, a regulatory sequence is a non-native or non-naturally-occurring sequence (e.g., a sequence which has been modified, mutated, substituted, derivatized or deleted, including sequences which are chemically synthesized). Preferred regulatory sequences include promoters, enhancers, termination signals, anti-termination signals and other expression control elements (e.g., sequences to which repressors or inducers bind and/or binding sites for transcriptional and/or translational regulatory proteins, for example, in the transcribed mRNA). Such regulatory sequences are described, for example, in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989. Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in a microorganism (e.g., constitutive promoters and strong constitutive promoters), those which direct inducible expression of a nucleotide sequence in a microorganism (e.g., inducible promoters, for example, xylose inducible promoters) and those which attenuate or repress expression of a nucleotide sequence in a microorganism (e.g., attenuation signals or repressor sequences). It is also within the scope of the present invention to regulate expression of a gene of interest by removing or deleting regulatory sequences. For example, sequences involved in the negative regulation of transcription can be removed such that expression of a gene of interest is enhanced.

[0058] In one embodiment, a recombinant nucleic acid molecule or recombinant vector of the present invention includes a nucleic acid sequence or gene that encodes at least one bacterial gene product (e.g., a gene product encoded by coaX) operably linked to a promoter or promoter sequence. Preferred promoters of the present invention include E. coli promoters or Bacillus promoters and/or bacteriophage promoters (e.g., bacteriophage which infect E. coli or Bacillus). In one embodiment, a promoter is a Bacillus promoter, preferably a strong Bacillus promoter (e.g., a promoter associated with a biochemical housekeeping gene in Bacillus or a promoter associated with a glycolytic pathway gene in Bacillus). In another embodiment, a promoter is a bacteriophage promoter. In a preferred embodiment, the promoter is from the bacteriophage SPO1. In a particularly preferred embodiment, a promoter is the P₂₆ promoter set forth as SEQ ID NO:18 or the P₁₅ promoter set forth as SEQ ID NO:19. Additional preferred promoters include tef (the translational elongation factor (TEF) promoter) and pyc (the pyruvate carboxylase (PYC) promoter), which promote high level expression in Bacillus (e.g., Bacillus subtilis). Additional preferred promoters, for example, for use in Gram positive microorganisms include, but are not limited to, the amyE promoter or phage SPO2 promoters. Additional preferred promoters, for example, for use in Gram negative microorganisms include, but are not limited to, tac, trp, tet, trp-tet, lpp, lac, lpp-lac, lacIq, T7, T5, T3, gal, trc, ara, SP6, λ-P_R or λ-P_L.

[0059] In another embodiment, a recombinant nucleic acid molecule or recombinant vector of the present invention includes a terminator sequence or terminator sequences (e.g., transcription terminator sequences). The term “terminator sequences” includes regulatory sequences which serve to terminate transcription of a gene. Terminator sequences (or tandem transcription terminators) can further serve to stabilize mRNA (e.g., by adding structure to mRNA), for example, against nucleases.

[0060] In yet another embodiment, a recombinant nucleic acid molecule or recombinant vector of the present invention includes sequences which allow for detection of the vector containing said sequences (i.e., detectable and/or selectable markers), for example, sequences that overcome auxotrophic mutations (for example, trpC or leuB), fluorescent markers, and/or colorimetric markers (e.g., lacZ/β-galactosidase), and/or antibiotic resistance genes (e.g., amp or tet).

[0061] In yet another embodiment, a recombinant nucleic acid molecule or recombinant vector of the present invention includes an artificial ribosome binding site (RBS). The term “artificial ribosome binding site (RBS)” includes a site within an mRNA molecule (e.g., coded within DNA) to which a ribosome binds (e.g., to initiate translation) which differs from a native RBS (e.g., an RBS found in a naturally-occurring gene) by at least one nucleotide. Preferred artificial RBSs include about 5-6, 7-8, 9-10, 11-12, 13-14, 15-16, 17-18, 19-20, 21-22, 23-24, 25-26, 27-28, 29-30 or more nucleotides of which about 1-2, 3-4, 5-6, 7-8, 9-10, 11-12, 13-15 or more differ from the native RBS (e.g., the native RBS of a gene of interest). Preferably, nucleotides which differ are substituted such that they are identical to one or more nucleotides of an ideal RBS for a particular gene. Artificial RBSs can be used to replace the naturally-occurring or native RBS associated with a particular gene. Artificial RBSs preferably increase translation of a particular gene.

[0062] In another embodiment, a recombinant vector of the present invention includes sequences that enhance replication in bacteria (e.g., replication-enhancing sequences). In one embodiment, replication-enhancing sequences are derived from E. coli. In another embodiment, replication-enhancing sequences are derived from pBR322.

[0063] In yet another embodiment, a recombinant vector of the present invention includes antibiotic resistance genes. The term “antibiotic resistance genes” includes sequences which promote or confer resistance to antibiotics on the host organism. In one embodiment, the antibiotic resistance genes are selected from the group consisting of cat (chloramphenicol resistance) genes, tet (tetracycline resistance) genes, amp (ampicillin resistance) genes, erm (erythromycin resistance) genes, neo (neomycin resistance) genes and spec (spectinomycin resistance) genes. Recombinant vectors of the present invention can further include homologous recombination sequences (e.g., sequences designed to allow recombination of the gene of interest into the chromosome of the host organism). For example, amyE sequences can be used as homology targets for recombination into the host chromosome.

[0064] Preferred vectors of the present invention include, but are not limited to, vectors set forth in FIGS. 8-10. It will further be appreciated by one of skill in the art that the design of a vector can be tailored depending on such factors as the choice of microorganism to be genetically engineered, the level of expression of gene product desired and the like.

[0065] The methodologies of the present invention feature microorganisms, e.g., recombinant microorganisms, preferably including genes or vectors as described herein, in particular, pantothenate kinase-encoding genes or vectors. The term “recombinant” microorganism includes a microorganism (e.g., bacteria, yeast cell, fungal cell, etc.) which has been genetically altered, modified or engineered (e.g., genetically engineered) such that it exhibits an altered, modified or different genotype and/or phenotype (e.g., when the genetic modification affects coding nucleic acid sequences of the microorganism) as compared to the naturally-occurring microorganism from which it was derived. Preferably, a “recombinant” microorganism of the present invention has been genetically engineered such that it overexpresses at least one bacterial gene or gene product (e.g., a pantothenate kinase-encoding gene) as described herein, preferably a pantothenate kinase-encoding gene included within a recombinant vector as described herein. The ordinarily skilled artisan will appreciate that a microorganism expressing or overexpressing a gene product produces or overproduces the gene product as a result of expression or overexpression of nucleic acid sequences and/or genes encoding the gene product.

[0066] The term “overexpressed” or “overexpression” includes expression of a gene product (e.g., a pantothenate kinase) at a level greater than that expressed prior to manipulation of a microorganism or in a comparable microorganism that has not been manipulated. In one embodiment, a microorganism is genetically manipulated (e.g., genetically engineered) to overexpress a level of gene product greater than that expressed prior to manipulation of the microorganism or in a comparable microorganism which has not been manipulated. Genetic manipulation can include, but is not limited to, altering or modifying regulatory sequences or sites associated with expression of a particular gene (e.g., by adding strong promoters, inducible promoters or multiple promoters or by removing regulatory sequences such that expression is constitutive), modifying the chromosomal location of a particular gene, altering nucleic acid sequences adjacent to a particular gene such as a ribosome binding site or transcription terminator, increasing the copy number of a particular gene, modifying proteins (e.g., regulatory proteins, suppressors, enhancers, transcriptional activators and the like) involved in transcription of a particular gene and/or translation of a particular gene product, or any other conventional means of deregulating expression of a particular gene routine in the art (including but not limited to use of antisense nucleic acid molecules, for example, to block expression of repressor proteins).

[0067] In another embodiment, the microorganism can be physically or environmentally manipulated to overexpress a level of gene product greater than that expressed prior to manipulation of the microorganism or in a comparable microorganism which has not been manipulated. For example, a microorganism can be treated with or cultured in the presence of an agent known or suspected to increase transcription of a particular gene and/or translation of a particular gene product such that transcription and/or translation are enhanced or increased. Alternatively, a microorganism can be cultured at a temperature selected to increase transcription of a particular gene and/or translation of a particular gene product such that transcription and/or translation are enhanced or increased.

[0068] Still other preferred recombinant microorganisms of the present invention are mutant microorganisms. As used herein, the term “mutant microorganism” includes a recombinant microorganism that has been genetically engineered to express a mutated gene or protein that is normally or naturally expressed by the microorganism. Preferably, a mutant microorganism expresses a mutated gene or protein such that the microorganism exhibits an altered, modified or different phenotype (e.g., has been engineered to express a mutated CoA biosynthetic enzyme, for example, pantothenate kinase). In one embodiment, a mutant microorganism is designed or engineered such that it includes a mutant coaX gene, as defined herein. In another embodiment, a recombinant microorganism is designed or engineered such that it includes a mutant coaA gene, as defined herein. In another embodiment, a mutant microorganism is designed or engineered such that a coaX gene has been deleted (i.e., the protein encoded by the coaX gene is not produced). In another embodiment, a mutant microorganism is designed or engineered such that a coaA gene has been deleted (i.e., the protein encoded by the coaA gene is not produced). Preferably, a mutant microorganism has a mutant coaX gene or a mutant coaA gene, or has been engineered to have a coaX gene and/or coaA gene deleted, such that the mutant microorganism exhibits a “reduced pantothenate kinase activity”. In the context of a whole microorganism, pantothenate kinase activity can be determined by measuring or assaying for a decrease in an intermediate or product of the CoA biosynthetic pathway, for example, measuring or assaying for 4′-phosphopantothenate, 4′-phosphopantothenylcysteine, 4′-phosphopantetheine, dephosphocoenzyme A, Coenzyme A, apo-acyl carrier protein (apo-ACP) or holo-acyl carrier protein (ACP) in the microorganism (e.g., in a lysate isolated or derived from the microorganism) or in the medium in which the microorganism is cultured. Alternatively, pantothenate kinase or CoaX activity can be determined by measuring or assaying for increased or decreased growth of the microorganism. Alternatively, pantothenate kinase activity can be determined indirectly by measuring or assaying for an increase in pantothenate, which is the substrate of pantothenate kinase.

[0069] In one embodiment, a recombinant microorganism of the present invention is a Gram negative organism (e.g., a microorganism which excludes basic dye, for example, crystal violet, due to the presence of a Gram-negative wall surrounding the microorganism). In another embodiment, a recombinant microorganism of the present invention is a Gram positive organism (e.g., a microorganism which retains basic dye, for example, crystal violet, due to the presence of a Gram-positive wall surrounding the microorganism). In a preferred embodiment, the recombinant microorganism is a microorganism belonging to a genus selected from the group consisting of Escherichia, Helicobacter, Pseudomonas, Bordetella and Bacillus. In a more preferred embodiment, the recombinant microorganism is of the genus Escherichia or Bacillus.

[0070] In another embodiment, the recombinant microorganism is a Gram negative (excludes basic dye) organism. In a preferred embodiment, the recombinant microorganism is a microorganism belonging to a genus selected from the group consisting of Salmonella, Escherichia, Klebsiella, Serratia, and Proteus. In a more preferred embodiment, the recombinant microorganism is of the genus Escherichia. In an even more preferred embodiment, the recombinant microorganism is Escherichia coli. In another embodiment, the recombinant microorganism is Saccharomyces (e.g., S. cerevisiae).

[0071] V. Screening Assays

[0072] Because CoA is an essential factor in bacteria, proteins (e.g., enzymes) involved in the biosynthesis of CoA provide valuable tools in the search for novel antibiotics. In particular, the CoaX protein is a valuable target for identifying bactericidal compounds because it bears no resemblance in primary sequence to mammalian pantothenate kinase enzymes or to the CoaA enzymes that are essential for beneficial enteric bacteria such as E. coli. Accordingly, the present invention also provides a method (also referred to herein as a “screening assay”) for identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides, peptidomimetics, small molecules or other drugs) that bind to CoaX, or have a stimulatory or inhibitory effect on, for example, coaX expression or CoaX activity.

[0073] In one embodiment, the invention provides assays for screening candidate or test compounds that are capable of binding to CoaX proteins or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that modulate the activity of CoaX proteins or biologically active portions thereof. As used herein, the phrase “CoaX activity” includes any detectable or measurable activity of the CoaX protein, i.e., the protein encoded by the coaX gene of the present invention, for example, the coaX gene derived from a microorganism selected from the group consisting of Aquifex aeolicus, Bacillus anthracis, Bacillus halodurans, Bacillus stearothermophilus, Bacillus subtilis, Caulobacter crescentus, Chlorobium tepidum, Clostridium acetobutylicum, Dehalococcoides ethenogenes, Deinococcus radiodurans, Desulfovibrio vulgaris, Geobacter sulfurreducens, Pseudomonas putida, Rhodobacter capsulatus, Thiobacillus ferrooxidans, Streptomyces coelicolor, Synechocystis sp., Thermotoga maritima, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Porphyromonas gingivalis, Pseudomonas aeruginosa, Treponema pallidum, Xylella fastidiosa, Legionella pneumophila, and Mycobacterium tuberculosis. In a preferred embodiment, a CoaX activity is at least one of the following: (1) modulation of at least one step in the CoA biosynthetic pathway; (2) promotion of CoA biosynthesis; (3) phosphorylation of a CoaX substrate; (4) a pantothenate kinase activity; and (5) complementation of a CoaX mutant.

[0074] The test compounds of the present invention can be obtained using any of the numerous approaches in chemical compound library methods known in the art, including: natural compound libraries; biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library approach is limited to peptide libraries, while the other approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K.S. (1997) Anticancer Drug Des. 12:145).

[0075] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. U.S.A. 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233. Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner USP '409), plasmids (Cull et al. (1992) Proc Natl Acad Sci USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390); (Devlin (1990) Science 249:404-406); (Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382); (Felici (1991) J. Mol. Biol. 222:301-310); (Ladner supra.).

[0076] In one embodiment, an assay is a microorganism-based assay in which a recombinant microorganism that expresses a CoaX protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to modulate CoaX activity is determined. Determining the ability of the test compound to modulate CoaX activity can be accomplished by monitoring, for example, growth, intracellular phosphopantothenate or CoA concentrations, or secreted pantothenate concentrations (as compounds that inhibit CoaX will result in a buildup of pantothenate in the test microorganism). CoaX substrate can be labeled with a radioisotope or enzymatic label such that modulation of CoaX activity can be determined by detecting a conversion of labeled substrate to intermediate or product. For example, CoaX substrates can be labeled with ³²P, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. The ability of a compound to modulate CoaX activity can alternatively be determined by detecting the induction of a reporter gene (comprising a CoA-responsive regulatory element operatively linked to a nucleic acid encoding a detectable marker, e.g., luciferase), or detecting a CoA-regulated cellular response.

[0077] In yet another embodiment, a screening assay of the present invention is a cell-free assay in which a CoaX protein or biologically active portion thereof is contacted with a test compound in vitro and the ability of the test compound to bind to or modulate the activity of the CoaX protein or biologically active portion thereof is determined. In a preferred embodiment, the assay includes contacting the CoaX protein or biologically active portion thereof with known substrates to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to modulate enzymatic activity of the CoaX on its substrates.

[0078] Screening assays can be accomplished in any vessel suitable for containing the microorganisms, proteins, and/or reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In more than one embodiment of the above assay methods of the present invention, it may be desirable to immobilize either CoaX protein, CoaX substrate, substrate analogs or a recombinant microorganism expressing CoaX protein to facilitate separation of products, ligands, and/or substrates, as well as to accommodate automation of the assay. For example, glutathione-S-transferase/CoaX fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtiter plates. Other techniques for immobilizing proteins on matrices (e.g., biotin-conjugation and streptavidin immobilization or antibody conjugation) can also be used in the screening assays of the invention.

[0079] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein in an appropriate animal model. For example, a CoaX modulating agent identified as described herein (e.g., a bactericidal compound) can be used in an animal model of infection to determine the efficacy, toxicity, or side effects of treatment with such an agent.

[0080] CoaX modulators can further be designed based on the crystal structure of any one of the CoaX proteins of the present invention. In particular, based at least in part on the discovery of CoaX as an essential bacterial protein, one can produce significant quantities of the CoaX protein, for example using the recombinant methodologies as described herein, purify and crystallize said protein, subject said protein to X-ray crystallographic procedures and, based on the determined crystal structure, design modulators (e.g., active site modulators, for example, competitor molecules, active site inhibitors, and the like), and test said designed modulators according to any one of the assays described herein.

[0081] This invention is further illustrated by the following examples which should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

EXAMPLES

Example I Assays for CoaX Genes or Activities

[0082] Assay for Pantothenate Kinase Genes or In Vivo Pantothenate Kinase Activity

[0083] In order to assay for genes encoding pantothenate kinase, the ability of plasmids containing these genes to complement the coaA15(Ts) mutation in E. coli strain YH1 is tested at the non-permissive temperature of 43-44° C. The defect in E. coli coaA15(Ts) has been identified as an S177L mutation that lies in a region that is highly conserved among bacterial pantothenate kinases, including CoaA of B. subtilis. Strain YH1 was constructed by P1 transduction from publicly available strain DV62 (Coli Genetic Stock Center) to publicly available strain YMC9 (ATCC), selecting for tetracycline resistance and screening for temperature sensitivity at 43° C.

[0084] In Vitro Assay for Pantothenate Kinase Activity

[0085] The assay for pantothenate kinase is based on the fact that under appropriate mildly acidic conditions (1% acetic acid in 95% ethanol), the product of the reaction, 4′-phosphopantothenate, binds to positively charged ion exchange paper, while the substrate, pantothenate, does not (see Vallari, D., Jackowski, S., and Rock, C., (1987), Journal of Biological Chemistry, Vol. 262, pp. 2468-2471, hereby incorporated by reference).

[0086] Cells of the strain to be assayed (bacteria, yeast, fungi, animal, or plant cells) are grown to late logarithmic phase or stationary phase, in 200 ml of an appropriate medium, for example Luria Broth or M9 minimal salts plus 0.5% glucose plus any necessary additives (for bacterial cells), at an appropriate temperature (25 to 44° C.). All subsequent steps are carried out at 0 to 4° C. The culture is cooled on ice for 10 minutes and the cells are concentrated by centrifugation at 7,000×g for 10 minutes. The cell pellet is rinsed by resuspending it in ice cold Buffer A (50 mM Tris-HCl, pH 7.4, 2.5 mM MgCl₂) and recentrifugation.

[0087] The rinsed cells are resuspended in the minimum possible volume (2-5 ml, depending on the size of the pellet) of Buffer A. The cells are then broken open by sonication in an inverted stainless steel test tube cap on ice for four bursts of 15 seconds each with 30 seconds of cooling between each burst. Cell debris is then removed from the lysed cells by centrifugation at 10,000×g for 10 minutes. The supernatant solution is then dialyzed for 12-16 hrs against two changes of one liter of Buffer A with 0.1 mM dithiothreitol added. Dialysis may be necessary to prevent the reaction product from undergoing further reactions catalyzed by the crude cell extract. Protein concentration in the dialyzed extracts is measured with a BCA Protein Assay Kit from BioRad.

[0088] The assay mix contains (final amounts or concentrations) about zero to 150 μg protein, 80 μM ¹⁴C-D-pantothenate, specific activity about 60,000 dpm/nmole (purchased from American Radiolabeled Chemicals, Inc.), 2.5 mM ATP (Sigma Chemical Company, sodium salt), 2.5 mM MgCl₂, and 100 mM Tris-HCl, pH 7.4, in a total volume of 40 μl. The reaction mix, minus the ATP, can be preincubated for about 1 to 10 minutes at an appropriate temperature (25 to 55° C.), in which case the reaction is started by addition of the ATP from a concentrated stock, also preincubated at the assay temperature.
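
As an illustration of how the 40 μl assay mix might be assembled, the following minimal sketch computes the volume of each stock solution from the final concentrations stated above using C1·V1 = C2·V2. The stock concentrations and the extract volume are hypothetical values chosen only for the example; they are not specified herein.

```python
def stock_volumes_ul(final_conc, stock_conc, total_volume_ul=40.0):
    """Volume of each stock (in microliters) needed for the final mix.

    final_conc and stock_conc map component name -> concentration. The two
    values for a given component must use the same unit (C1*V1 = C2*V2).
    """
    volumes = {}
    for name, final in final_conc.items():
        stock = stock_conc[name]
        if stock <= final:
            raise ValueError(f"stock for {name} must exceed its final concentration")
        volumes[name] = final * total_volume_ul / stock
    return volumes


# Final concentrations as given in the assay description.
FINAL = {
    "Tris-HCl pH 7.4 (mM)": 100.0,
    "MgCl2 (mM)": 2.5,
    "ATP (mM)": 2.5,
    "14C-D-pantothenate (uM)": 80.0,
}
# Hypothetical stock concentrations (assumptions for illustration only).
STOCK = {
    "Tris-HCl pH 7.4 (mM)": 1000.0,
    "MgCl2 (mM)": 100.0,
    "ATP (mM)": 100.0,
    "14C-D-pantothenate (uM)": 800.0,
}

volumes = stock_volumes_ul(FINAL, STOCK)
extract_ul = 10.0  # hypothetical volume of dialyzed extract added
water_ul = 40.0 - extract_ul - sum(volumes.values())

for name, vol in volumes.items():
    print(f"{name}: {vol:.1f} ul")
print(f"extract: {extract_ul:.1f} ul, water: {water_ul:.1f} ul")
```

Note that any MgCl₂ carried over from Buffer A in the extract is ignored in this sketch.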

[0089] After incubation for 1 to 10 minutes, the reaction is stopped by pipetting 35 μl of the reaction mix into an Eppendorf tube containing 1 ml of 95% ethanol, 1% acetic acid. After thorough mixing, the precipitated protein is pelleted in a microcentrifuge at top speed for one minute. The resulting supernatant solution is then applied to a one inch (or other appropriate size) disk of Whatman DE81 ion exchange filter paper prewetted with distilled water in a vacuum filtration manifold (for example Millipore 1225 Sampling Manifold). Each disk is then rinsed three times with 10 ml of 1% acetic acid in 95% ethanol. The top plate is then removed from the manifold and the completely exposed filter disks are each rinsed once more with 5 ml of the same rinse solution. The rinsed filters are then counted in a scintillation counter appropriately set for ¹⁴C in 10 ml of Ecolume scintillation fluid. The specific activity of the pantothenate kinase can be calculated by determining the number of moles of substrate converted to product per mg protein per minute under various appropriate conditions of the assay.
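
The specific activity calculation described above can be sketched as follows. This is a minimal illustration, not a prescribed procedure: it assumes that counts have already been converted to dpm, that the entire supernatant from the 35 μl aliquot is applied to the filter, and that a blank value is subtracted; the function name and the example numbers are hypothetical.

```python
def pantothenate_kinase_specific_activity(
    filter_dpm,
    blank_dpm,
    protein_ug,
    minutes,
    substrate_dpm_per_nmol=60_000.0,
    reaction_volume_ul=40.0,
    sampled_volume_ul=35.0,
):
    """Return specific activity in nmol product per minute per mg protein."""
    if protein_ug <= 0 or minutes <= 0:
        raise ValueError("protein_ug and minutes must be positive")
    # Subtract the assay blank; clamp at zero so noise cannot give negative product.
    net_dpm = max(filter_dpm - blank_dpm, 0.0)
    # Filter-bound dpm -> nmol of 4'-phosphopantothenate in the sampled aliquot.
    nmol_in_aliquot = net_dpm / substrate_dpm_per_nmol
    # Scale from the 35 ul aliquot up to the whole 40 ul reaction.
    nmol_in_reaction = nmol_in_aliquot * reaction_volume_ul / sampled_volume_ul
    protein_mg = protein_ug / 1000.0
    return nmol_in_reaction / (minutes * protein_mg)


# Hypothetical readings: 10,600 dpm on the filter, 100 dpm blank,
# 50 ug protein, 5 minute incubation.
activity = pantothenate_kinase_specific_activity(10_600, 100, 50, 5)
print(f"{activity:.2f} nmol/min/mg")
```

With these example numbers the net 10,500 dpm corresponds to 0.175 nmol in the aliquot, 0.2 nmol in the full reaction, and therefore 0.8 nmol per minute per mg protein.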

[0090] Appropriate assay blanks include, but are not limited to, the standard mix except without ATP or without protein extract, or a complete mix incubated on ice for the shortest possible time before pipetting to the filter disk (preferably less than 10 seconds).

[0091] The assay should be checked for linearity with time up to 10 minutes, and for linearity with protein between zero and 150 μg. No more than 10% of the input ¹⁴C-pantothenate should be converted to phosphorylated product for the most accurate measurement of activity.
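
The two checks described above (linearity, and conversion of no more than 10% of the input substrate) can be sketched as shown below. The time-course values are hypothetical, and the use of an ordinary least-squares fit with R² as the linearity measure is an assumption of this example rather than a stated requirement.

```python
def fraction_converted(product_nmol, substrate_um=80.0, reaction_volume_ul=40.0):
    """Fraction of input pantothenate converted to product in one reaction."""
    # uM * ul gives pmol; divide by 1000 for nmol (80 uM in 40 ul = 3.2 nmol).
    input_nmol = substrate_um * reaction_volume_ul / 1000.0
    return product_nmol / input_nmol


def linear_fit(xs, ys):
    """Least-squares slope, intercept and R^2 for a straight line."""
    n = len(xs)
    if n < 3 or n != len(ys):
        raise ValueError("need at least three paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y values must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical time course: minutes vs nmol product formed.
minutes = [0, 2, 4, 6, 8, 10]
product_nmol = [0.00, 0.06, 0.12, 0.18, 0.24, 0.30]

slope, intercept, r_squared = linear_fit(minutes, product_nmol)
worst_case = fraction_converted(max(product_nmol))
print(f"rate {slope:.3f} nmol/min, R^2 {r_squared:.4f}")
print(f"max conversion {worst_case:.1%}, within limit: {worst_case <= 0.10}")
```

The same `linear_fit` function can be applied to product formed versus μg protein to check linearity with protein.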

[0092] Temperature sensitivity of the pantothenate kinase enzyme can be tested by preincubating the reaction mix at various temperatures (25 to 55° C.) for various lengths of time (zero to 60 minutes) before addition of ATP to start the reaction.

[0093] For pantothenate kinases other than that encoded by the E. coli coaA gene, the optimum temperature, pH, MgCl₂ concentration, buffering ion, ATP (or other substrate containing a high energy phosphate donor) concentration, salt type and concentration, total ionic strength, etc., may need to be determined. For accurate determination of enzyme activity, it may be necessary to purify or partially purify the pantothenate kinase enzyme from crude extracts, for example by ammonium sulfate fractionation and/or by column chromatography.

[0094] The assay may be adapted for high throughput screening, for example by using γ-thio-ATP instead of ATP and then reacting the transferred thio group with a conveniently detectable signalling molecule (see Jeong, S., and Nikiforov, T., (1999), Biotechniques Vol. 27, pp. 1232-1238; and Facemyer, K., and Cremo, C., (1992), Bioconjug. Chem. Vol. 3, pp. 408-413, both of which are hereby incorporated by reference).

Example II Identification and Characterization of a First B. subtilis Gene Encoding Pantothenate Kinase, the coaA Gene

[0095] The annotated version of the B. subtilis genome sequence available on the “Subtilist” web site contained no gene labeled as coaA. However, a homology search using the protein sequence of E. coli pantothenate kinase as a query sequence gave a good match with B. subtilis gene yqjS, which is annotated as “unknown; similar to pantothenate kinase.” This gene appears to be the penultimate gene in an operon containing five open reading frames (FIG. 2). Two of the open reading frames encode proteins which are similar to D-serine dehydratase and to “ketoacyl reductase”; the other two have no known homologies. For the open reading frame corresponding to coaA, there are three possible start codons, each having a possible ribosome-binding site (RBS) associated with it. The three potential coaA ORFs were named coaA1, coaA2, and coaA3, from longest to shortest.

[0096] All three potential coaA open reading frames were cloned along with their respective RBSs by PCR followed by ligation into expression plasmid pAN229 to form plasmids pAN281, pAN282 and pAN283. pAN229 is a low copy vector in E. coli that provides expression from the SPO1 phage P₁₅ promoter and can integrate by single crossover at bpr with tetracycline selection.

[0097] To determine if the cloned putative coaA ORFs actually encode a pantothenate kinase activity, several isolates of all three plasmids were transformed into the E. coli strain YH1, which contains the coaA15(Ts) allele. Transformants were streaked to plates incubated at 30° and 43° C. to test for complementation of the temperature sensitive allele. Isolates of all three coaA variants complemented well at 43° C., indicating that all three plasmid constructs encode an active pantothenate kinase. Accordingly, it can be concluded that the B. subtilis yqjS open reading frame codes for an active pantothenate kinase.

Example III Deletion of the coaA Gene from the B. subtilis Genome

[0098] The coaA gene of B. subtilis (yqjS) was deleted from the chromosome of a B. subtilis strain by conventional means. The majority of the coaA coding sequence was deleted from a plasmid clone and replaced by a chloramphenicol resistance gene (cat), while leaving approximately 1 kb of upstream and downstream sequence to allow homologous recombination with the chromosome, to give plasmid pAN296 (see FIG. 3). pAN296 was then used to transform a B. subtilis strain (PY79), selecting for chloramphenicol resistance. The majority of transformants result from a double crossover event that effectively substitutes the cat gene for the coaA gene. The transformed strain containing the coaA deletion and cat insertion (named PA861) grew normally, indicating the presence of a second B. subtilis pantothenate kinase-encoding gene, described herein.

Example IV Identification and Characterization of a Second B. subtilis Gene Encoding Pantothenate Kinase Activity, the coaX Gene

[0099] After finding that deletion of the coaA gene from the chromosome of B. subtilis is not a lethal event (see Example III), it was concluded that B. subtilis must contain a second gene that encodes an active pantothenate kinase, since pantothenate kinase is an essential enzyme activity.

[0100] A second pantothenate kinase-encoding gene was identified by complementing the E. coli strain YH1 (coaA15(Ts)) with a B. subtilis gene bank and selecting for transformants that were able to grow at 43° C. Found among the transformants were two families of plasmids that had overlapping restriction maps within each family, but not between the families. As expected, the restriction map of one family was identical to that predicted from the B. subtilis genome sequence for the homologue of the E. coli coaA gene (which we named coaA also, see above) and surrounding sequences. The other family had a restriction map that was completely non-overlapping with the first.

[0101] DNA sequencing of the ends of the cloned inserts from the second family showed that the clones came from a region of the B. subtilis chromosome that includes the 3′ end of the ftsH gene, the 5′ end of the sul gene, and all of the yacB, yacC, yacD, cysK, pabB, pabA and pabC genes. None of the open reading frames of these cloned inserts showed homology to any known pantothenate kinase sequences, either prokaryotic or eukaryotic.

[0102] Several deletions were created through the B. subtilis genomic sequences in the cloned inserts. Each deletion was tested for complementation of the E. coli temperature sensitive pantothenate kinase. In particular, a deletion that removed all DNA between a Stu I site in the cloning vector and a Swa I site in the yacC gene left yacB as the only intact open reading frame in the cloned insert (see FIG. 4). This deleted plasmid still complemented the E. coli pantothenate kinase mutant. However, another deletion that removed DNA from the Swa I site in yacC through a Bst1107I site in the (already truncated) ftsH gene could not complement the E. coli pantothenate kinase mutant. From these results, it was concluded that the yacB open reading frame was responsible for the complementation activity. To confirm that yacB is a pantothenate kinase gene, the yacB ORF plus 112 base pairs of downstream flanking sequence was amplified by PCR in two independent reactions and cloned downstream of a constitutive promoter to give plasmids pAN341 and pAN342 (FIG. 5). Both pAN341 and pAN342 complemented the defect in YH1 at 44° C., while a control plasmid, which has the same backbone but expresses panBCD instead of yacB, did not. This confirmed that the yacB open reading frame was responsible for the complementation of YH1.

[0103] As such, a novel gene that encodes pantothenate kinase activity in B. subtilis has been discovered that is not related by homology to any previously known pantothenate kinase gene. This gene has been renamed coaX, as a second, alternative gene that encodes an enzyme that catalyzes the first step in the pathway from pantothenate to CoA. In B. subtilis strains deleted for coaA, coaX is an essential gene.

[0104] Several homologues of the B. subtilis coaX gene were identified by homology searching of various publicly available databases using the published yacB (coaX) open reading frame sequence and predicted amino acid sequence (as set forth in SEQ ID NOs: 15 and 16, respectively). In two cases (Mycobacterium tuberculosis and Streptomyces coelicolor) the homologous coaX genes are adjacent to, or almost adjacent to, pantothenate biosynthetic genes, consistent with these homologs having a role in pantothenate metabolism. The CoaX proteins show no homology to the CoaA family of pantothenate kinases, nor to the eukaryotic family of pantothenate kinases exemplified by PanK of Saccharomyces cerevisiae.

[0105] Alignment of the amino acid sequences of several bacterial CoaX homologs with the amino acid sequence predicted from translating the B. subtilis yacB ORF described in the published B. subtilis genome sequence revealed that the CoaX proteins from other bacteria contained additional amino acid residues at their carboxy-terminal ends. Moreover, these extensions beyond the end of the predicted amino acid sequence for the B. subtilis gene product contained two relatively well conserved segments of sequence.

[0106] Translation of nucleotide sequences just downstream from the stop codon of the B. subtilis yacB ORF in a different reading frame revealed the existence of amino acid sequences very similar to the carboxy-terminal extensions of the other bacterial CoaX proteins. It is thus believed that an error exists in the published DNA sequence of the B. subtilis yacB ORF sequence that causes a frame shift leading to an artifactual downstream amino acid sequence and premature termination.
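
The reasoning above, that translating the same nucleotide sequence in a different reading frame can reveal coding sequence hidden by an apparent stop codon, can be illustrated with a short sketch using the standard genetic code. The DNA string below is a hypothetical toy sequence, not the yacB sequence, and the function name is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate one forward reading frame (0, 1 or 2); '*' marks a stop codon."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    dna = dna.upper()
    return "".join(
        CODON_TABLE[dna[i:i + 3]]
        for i in range(frame, len(dna) - 2, 3)
    )


# Hypothetical toy sequence: frame 0 hits a stop codon after two residues,
# while frame 1 reads through the same region without a stop.
toy_dna = "ATGGCCTAAGGTGAA"
for frame in range(3):
    print(frame, translate(toy_dna, frame))
```

In this toy case frame 0 terminates early while frame 1 continues, which is the pattern that would be expected if a sequencing error had shifted the true reading frame.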

[0107] The PCR-generated sequences of B. subtilis coaX in pAN341 and pAN342 (described above) contain enough downstream flanking sequence to encode the putative carboxy-terminal extension described above, which is consistent with the result that the clones were functional in the complementation assay. However, when the 3′ PCR primer was positioned to include only the shorter yacB ORF predicted from the published sequence, but not to include the putative carboxy-terminal extension, then the resulting plasmids, pAN329 and pAN330 (similar in structure to pAN341 and pAN342; see FIG. 5), did not complement the defect in YH1. This result supports the notion that the published yacB coding sequence contains a frame-shift error, and that the carboxy-terminal end of CoaX is necessary for pantothenate kinase activity. A predicted correct nucleotide sequence for B. subtilis coaX is set forth as SEQ ID NO:1 and the translated amino acid sequence is set forth as SEQ ID NO:2. A multiple sequence alignment of the CoaX amino acid sequences of B. subtilis and 11 homologues thereof is set forth in FIG. 6.

Example V Deleting the Second Pantothenate Kinase Gene, coaX, From B. subtilis

[0108] With the knowledge gained above concerning the existence and nature of coaX, one can create a deletion of the coaX open reading frame from the B. subtilis chromosome that will remove the encoded activity, and that will not adversely affect the expression of the genes downstream from coaX. In such a deleted strain, the coaA gene will be the only gene that encodes pantothenate kinase.

[0109] To delete the coaX gene from B. subtilis, plasmid pAN336, which contains upstream and downstream homology for double crossover, was constructed with a kanamycin resistance gene replacing most of the coaX ORF (FIG. 7). Strain PY79 was transformed to kanamycin resistance by pAN336, and an isolate confirmed by PCR to have resulted from a double crossover was named PA876. As predicted, deletion of coaX by itself is not lethal for B. subtilis. Furthermore, chromosomal DNA from PA876 would not transform competent PA861 (PY79 ΔcoaA::cat) to kanamycin resistance. These results indicate that it is the combination of ΔcoaA::cat and ΔcoaX::kan that is lethal for B. subtilis, confirming that B. subtilis contains two unlinked genes that encode pantothenate kinase, coaA and coaX, and that either gene alone is capable of supplying sufficient pantothenate kinase for a normal rate of growth.

Example VI Identification of coaX Homologs in Other Microbes

[0110] Database analyses reveal that many bacteria, in addition to B. subtilis, contain homologs of the CoaX pantothenate kinase. As shown in Tables 1 and 2, both nonpathogenic and pathogenic bacteria can be found that contain homologs of this novel gene.

TABLE 1
CoaX homologs in Non-Pathogens

Species                       Genome    CoaA homolog    CoaX homolog
                              complete
Aquifex aeolicus              Yes       NONE            RAA00700
                                                        aq_1924
                                                        AAC07720.1
                                                        pir∥E70465
Bacillus halodurans           Yes       BH2875          BH0086
                                        BAB06594.1
Bacillus stearothermophilus   No        NONE?           gnl|UOKNOR_1422|bstear_.Contig467
Bacillus subtilis             Yes       RBS02372 YqjS   RBS00070 YacB
                                        BAA12625.1      BAA05305.1
                                        CAB14308.1      CAB11846.1
                                        pir∥C69965      pir∥S66100
Caulobacter crescentus        No        NONE?           gnl|TIGR|C.crescentus_12574
Chlorobium tepidum            No        NONE?           gnl|TIGR|C.tepidum_3499
Clostridium acetobutylicum    No        NONE?           RCA03301
                                                        gnl|GTC|C.aceto_gnl
Dehalococcoides ethenogenes   No        NONE?           gnl|TIGR_61435|deth_1587
Deinococcus radiodurans       Yes       NONE            AAF10040.1
                                                        pir∥E75516
Desulfovibrio vulgaris        No        NONE?           BAA21476.1
                                                        P37564
                                                        gnl∥TIGR_881|dvulg_1371
Geobacter sulfurreducens      No        NONE?           gnl|TIGR_35554|gsulf_121
Pseudomonas putida KT2440     No        NONE?           gnl|TIGR|pputida_10724
Rhodobacter capsulatus        No        NONE?           RRC02473
Thiobacillus ferrooxidans     No        NONE?           gnl|TIGR|t_ferrooxidans_6155
Streptomyces coelicolor       No        COAA_STRCO      SCE94.31c
                                        g8469186        CAB40880.1
                                                        pir∥T35567
Synechocystis sp.             Yes       NONE            ORF_ID:slr0812
                                                        BAA18120
Thermotoga maritima           Yes       NONE            TM0883
                                                        AAD35964.1
                                                        pir∥D72320

[0111]

TABLE 2
CoaX homologs in Pathogens

Pathogen                    Genome    CoaA homolog  CoaX homolog                  Comments
                            complete
Haemophilus influenzae      Yes       RHI13313      NONE
Streptococcus pyogenes      No        RST01295      NONE
Yersinia pestis             No        RYP02180      NONE
Vibrio cholerae             Yes       VC0320        NONE
Bacillus anthracis          No        NONE?         YES
Bordetella pertussis        No        NONE?         BAF (BVG ACCESSORY FACTOR)
Borrelia burgdorferi        Yes       NONE          BB0527
Campylobacter jejuni        Yes       NONE          Cj0394c
Clostridium difficile       No        NONE?         YES
Helicobacter pylori         Yes       NONE          jhp0796 (strain J99)
                                                    HP0862 (strain 26695)
                                                    AAD07916.1
Neisseria meningitidis      Yes       NONE          NMA0357 (strain Z2491)        CoaX is fused to BirA
                                                    NMB2075 (strain MC58)
Neisseria gonorrhoeae       No        NONE?         RNG00193                      CoaX is fused to BirA
Porphyromonas gingivalis    No        NONE?         RPG01037
                                                    gnl|TIGR|P.gingivalis_GPG.con
Pseudomonas aeruginosa      Yes       NONE          RPA06755
                                                    PA4279
                                                    AAG07667.1
Treponema pallidum          Yes       NONE          RTP00155 (TP0431)
Xylella fastidiosa          Yes       NONE          XF1795
Legionella pneumophila      No        gnl|CUCGC_46|lpneumo_C930598.2F12.S
Mycobacterium leprae        No        MLCB1222.23
Mycobacterium tuberculosis  Yes       RMT04257 RMT02984 RMT04257 (Rv3600c)

[0112] Of particular interest are the seven human pathogens Helicobacter pylori, Borrelia burgdorferi, Pseudomonas aeruginosa, Campylobacter jejuni, Neisseria meningitidis, Treponema pallidum, and Bordetella pertussis, that contain the CoaX pantothenate kinase as their sole pantothenate kinase activity. For these bacteria, the CoaX pantothenate kinase represents an attractive target for screening for new antibiotics effective against one or more of these pathogens. One can overproduce the particular CoaX pantothenate kinase and use the isolated protein, partially purified protein or crude cell extracts to screen in vitro for compounds that modulate (e.g., inhibit) the pantothenate kinase activity. Alternatively, one can isolate compounds that specifically bind to the enzyme and test their ability to block the enzyme's activity. A known kinase activity represents a particularly favorable target for high-throughput screening for compounds that modulate or decrease that activity.
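The selection criterion used in this paragraph (a CoaX homolog present and no CoaA homolog found in a completed genome) can be expressed as a simple filter. The following Python sketch is illustrative only: it transcribes a subset of the Table 2 entries, and the class name Row, its field names, and the function sole_coax are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One Table 2 entry; field names are illustrative, not from the source."""
    pathogen: str
    genome_complete: bool
    coaA: str  # "NONE", "NONE?" (genome incomplete), or an identifier
    coaX: str  # "NONE" or an identifier


# A subset of Table 2, transcribed by hand for this sketch.
ROWS = [
    Row("Haemophilus influenzae", True, "RHI13313", "NONE"),
    Row("Vibrio cholerae", True, "VC0320", "NONE"),
    Row("Borrelia burgdorferi", True, "NONE", "BB0527"),
    Row("Helicobacter pylori", True, "NONE", "HP0862"),
    Row("Neisseria gonorrhoeae", False, "NONE?", "RNG00193"),
    Row("Pseudomonas aeruginosa", True, "NONE", "PA4279"),
    Row("Treponema pallidum", True, "NONE", "TP0431"),
]


def sole_coax(rows):
    """Pathogens whose completed genome shows a CoaX homolog and no CoaA homolog."""
    return [
        r.pathogen
        for r in rows
        if r.genome_complete and r.coaA == "NONE" and r.coaX != "NONE"
    ]


if __name__ == "__main__":
    for name in sole_coax(ROWS):
        print(name)
```

Entries marked "NONE?" are deliberately excluded by this filter because the absence of a CoaA homolog cannot be established from an incomplete genome sequence.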

[0113] Also of interest are other pathogens that contain a coaX gene, in particular if it is demonstrated that these other pathogens contain only a single pantothenate kinase, encoded by the coaX gene. Examples of such bacteria are Porphyromonas gingivalis, Neisseria gonorrhoeae, Clostridium difficile, and Bacillus anthracis, all of which have been shown to contain a coaX homolog. Whether or not they also contain a second pantothenate kinase encoded by a coaA homolog can be determined according to the methodologies taught in Examples II-IV.

Example VII Identification of coaX Homologs in Human Pathogens Lacking a Conventional Prokaryotic Pantothenate Kinase

[0114] Human pathogens Helicobacter pylori (agent in gastroenteritis, stomach ulcers, and potentially stomach cancer), Borrelia burgdorferi (agent in Lyme disease), Bordetella pertussis (agent in whooping cough), and Pseudomonas aeruginosa (opportunistic pathogen in cystic fibrosis) all contain homologs of the coaX gene of B. subtilis and no homologs of the coaA gene of E. coli or B. subtilis. This is also true for the pathogens Treponema pallidum, Campylobacter jejuni, and Neisseria meningitidis. We have shown in B. subtilis that in the absence of the coaA gene product (ΔcoaA mutant), the coaX gene product is essential, providing the only pantothenate kinase activity required for the synthesis of the essential compound, Coenzyme A. Therefore it can be predicted that the pantothenate kinase encoded by the coaX homolog in the above listed pathogens is an essential enzyme for each mentioned pathogen and is required for the survival and growth of the pathogen. In fact, it has been reported that the coaX homolog in Bordetella pertussis, called baf and classified as an auxiliary regulatory factor rather than a critical enzyme, is an essential gene (see Wood, G. E. and R. L. Friedman (2000) FEMS Microbiol. Lett. 193(1):25-30).

[0115] The CoaX protein is a favorable target for the development and screening of new antibiotics. First, the pantothenate kinase encoded by the coaX gene is an essential enzyme in a group of human pathogens, making it a good target for inactivation. Second, the enzyme activity (kinase) of the isolated CoaX protein or its homologs provides an ideal assay to screen large numbers of compounds (combinatorial libraries, etc.) for their ability to specifically inhibit the pantothenate kinase activity both in vitro and in vivo.
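As a purely hypothetical illustration of how readings from such an in vitro screen might be scored, the following Python sketch computes percent inhibition of kinase activity relative to an uninhibited control and a no-enzyme background. The function name, the well labels, the numerical readings, and the 50% hit threshold are all assumptions made for this example; none of them are taken from the experiments described herein.

```python
def percent_inhibition(signal, uninhibited, background):
    """Percent loss of activity relative to the uninhibited control.

    signal      -- reading from a well containing enzyme plus test compound
    uninhibited -- reading from a well containing enzyme and no compound
    background  -- reading from a well containing no enzyme
    """
    span = uninhibited - background
    if span <= 0:
        raise ValueError("uninhibited control must exceed background")
    return 100.0 * (uninhibited - signal) / span


# Hypothetical readings in arbitrary units, for illustration only.
UNINHIBITED = 1000.0
BACKGROUND = 100.0
READINGS = {"compound_1": 955.0, "compound_2": 550.0, "compound_3": 190.0}

HIT_THRESHOLD = 50.0  # assumed cut-off, in percent inhibition

if __name__ == "__main__":
    for name, signal in READINGS.items():
        pct = percent_inhibition(signal, UNINHIBITED, BACKGROUND)
        label = "hit" if pct >= HIT_THRESHOLD else "inactive"
        print(f"{name}: {pct:.1f}% inhibition ({label})")
```

Normalizing against both controls allows wells from different plates to be compared on a common scale, which is one reason an enzymatic readout lends itself to high-throughput formats.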

Example VIII Production of CoaX Proteins From Pathogens for Use in Screening Assays

[0116] To provide the pantothenate kinase proteins for screening assays, the coaX gene homolog was obtained by PCR from isolated, whole genome DNA of Helicobacter pylori (ATCC 700392), Borrelia burgdorferi (ATCC 35210), Bordetella pertussis (ATCC 9797), and Pseudomonas aeruginosa (ATCC 47085). Coding sequences for proteins with homology to B. subtilis CoaX were amplified by PCR using the primers and templates given in Table 3 with Pfx DNA polymerase (Life Technologies) according to the manufacturer's specifications. The PCR primers incorporate an XbaI restriction enzyme recognition site at the 5′ end of each product and a BamHI restriction enzyme recognition site at the 3′ end of each product. PCR products were digested with a mixture of XbaI and BamHI and then purified by preparative agarose gel electrophoresis.

TABLE 3
PCR primers and template DNAs used to amplify coding sequences homologous to B. subtilis coaX.

Organism: Bacillus subtilis 168
coaX homolog: yacB
Template DNA: Strain RL-1 genomic DNA
5′ amplification primer: TP175 5′-GGGTCTAGAAAAGGAGGAATTTAAATGTTACTGGTTATCGATGTGGGGAACACC-3′
3′ amplification primer: TP176 5′-GGGATCCTTATACACTTCCACGCGGTTTCTTTCATAAATCAATTCC-3′

Organism: Bordetella pertussis
coaX homolog: baf
Template DNA: Strain ATCC 9797 genomic DNA
5′ amplification primer: TP177 5′-GGGTCTAGAAAAGGAGGAATTTAAATGATTATCCTCATCGACTCCGGC-3′
3′ amplification primer: TP178 5′-GGGATCCTTAGGCCGTTGGCGCGCCTTGCGCGGCG-3′

Organism: Borrelia burgdorferi
coaX homolog: BB0527
Template DNA: Strain ATCC 35210 genomic DNA
5′ amplification primer: TP171 5′-GGGTCTAGAAAAGGAGGAATTTAAATGAATAAACCTTTATTATCAGAATTGATAATTGATATTGGAAATACCAGC-3′
3′ amplification primer: TP172 5′-GGGATCCTTAATTAACAAACTTAAAGTCAATAGAATTTCCTAAAATTCTAACGCCTTCTACAG-3′

Organism: Helicobacter pylori 26695
coaX homolog: HP0862
Template DNA: Strain ATCC 700392 genomic DNA
5′ amplification primer: TP167 5′-GGGTCTAGAAAAGGAGGAATTTAAATGCCAGCTAGGCAATCTTTTACAGATTTGAAAAACCTGG-3′
3′ amplification primer: TP168 5′-GGGATCCTTATTTGCATTCTAGTATCCCTGCTTTTTAAGAGCGATTTCCATCCCGTC-3′

Organism: Pseudomonas aeruginosa PAO1
coaX homolog: PA4279
Template DNA: Strain ATCC 47085 genomic DNA
5′ amplification primer: TP169 5′-GGGTCTAGAAAAGGAGGAATTTAAATGATTCTTGAGCTCGACTGTGGAAACTCGCTG-3′
3′ amplification primer: TP170 5′-GGGATCCTTACTCAATCGGGCAAGCCAGTGCCAGCCCTACG-3′
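The primer design described above can be checked with a short script. The following Python sketch is offered as an illustration only: it takes the Bordetella pertussis primers TP177 and TP178 of Table 3 (joined here from their line-wrapped fragments, which is an assumption of this sketch) and locates the XbaI and BamHI recognition sequences, TCTAGA and GGATCC respectively. The function and variable names are hypothetical.

```python
XBAI_SITE = "TCTAGA"   # XbaI recognition sequence
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

# B. pertussis baf primers from Table 3, written 5' to 3'.
PRIMERS = {
    "TP177": "GGGTCTAGAAAAGGAGGAATTTAAATGATTATCCTCATCGACTCCGGC",
    "TP178": "GGGATCCTTAGGCCGTTGGCGCGCCTTGCGCGGCG",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def site_position(primer, site):
    """0-based index of the first occurrence of site in primer, or -1 if absent."""
    return primer.find(site)


if __name__ == "__main__":
    fwd, rev = PRIMERS["TP177"], PRIMERS["TP178"]
    print("XbaI site in TP177 at index", site_position(fwd, XBAI_SITE))
    print("BamHI site in TP178 at index", site_position(rev, BAMHI_SITE))
    # On the coding strand, the 3' primer contributes a stop codon
    # immediately upstream of the BamHI site.
    print("Coding-strand end:", reverse_complement(rev)[-10:])
```

In this sketch the 5′ primer carries the XbaI site followed by a start codon, and the reverse complement of the 3′ primer ends with a TAA stop codon followed by the BamHI site, consistent with the directional cloning described in the following paragraph.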

[0117] The purified PCR products were cloned by ligation with plasmid vector pASK-1BA3 (Sigma-Genosys), which had been digested with XbaI and BamHI, followed by transformation into strains LH-1 and XL1-Blue/MRF'kan. Plasmids containing inserts were identified by restriction enzyme digestion of plasmid DNA isolated from selected transformants. Examples of plasmids containing the H. pylori (pOTP72), P. aeruginosa (pOTP73), or B. subtilis (pOTP71) coaX gene are shown in FIGS. 8, 9 and 10, respectively. The identity of inserts in plasmids is confirmed by DNA sequence analysis.

[0118] The pantothenate kinase activity of each of the above cloned coaX homologs can be demonstrated by transforming the plasmids described above into E. coli strain YH1 containing the coaA15(Ts) mutation and looking for complementation at the non-permissive temperature of 43°-44° C. For example, as shown in Table 4, transformation of E. coli YH1 containing the coaA15(Ts) mutation with plasmid pOTP72 containing the cloned H. pylori coaX gene (HP0862) or plasmid pOTP73 containing the cloned P. aeruginosa coaX gene (PA4279) enabled the E. coli cells with the temperature-sensitive coaA gene product to grow at 44° C., as is also the case when these cells were transformed with the plasmid containing the B. subtilis coaX gene (pOTP71). These experiments confirm that the coaX homologs in H. pylori and P. aeruginosa do indeed each encode an active pantothenate kinase.

TABLE 4
Transformation of YH1 (coaA15(Ts)) with coaX ligation mixtures and control plasmid DNA

DNA                                           Number of colonies  Number of colonies
                                              at 30° C.           at 44° C.
NONE                                          zero                zero
Ligated, cut vector                           5                   zero
Uncut vector (pASK-1BA3)                      >500                zero
B. subtilis coaX, pool A ligation             74                  67
B. subtilis coaX, pool B ligation             230                 160
H. pylori coaX (HP0862) pool A ligation       53                  38
H. pylori coaX (HP0862) pool B ligation       99                  56
P. aeruginosa coaX (PA4279) pool A ligation   366                 279
P. aeruginosa coaX (PA4279) pool B ligation   282                 359
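One way to summarize the complementation data of Table 4 is to express the number of colonies at 44° C. as a fraction of the number at 30° C. for each DNA sample. The following Python sketch performs that arithmetic on the colony counts of Table 4; the ratio itself, the function name, and the dictionary layout are conveniences introduced here and are not an analysis performed in the original experiments. The uncut vector control is omitted because its 30° C. count is reported only as greater than 500.

```python
# (colonies at 30 C, colonies at 44 C), copied from Table 4.
COLONIES = {
    "Ligated, cut vector": (5, 0),
    "B. subtilis coaX, pool A": (74, 67),
    "B. subtilis coaX, pool B": (230, 160),
    "H. pylori coaX (HP0862), pool A": (53, 38),
    "H. pylori coaX (HP0862), pool B": (99, 56),
    "P. aeruginosa coaX (PA4279), pool A": (366, 279),
    "P. aeruginosa coaX (PA4279), pool B": (282, 359),
}


def colony_ratio(at_30, at_44):
    """Colonies at the non-permissive temperature per colony at 30 C."""
    if at_30 <= 0:
        raise ValueError("no colonies at the permissive temperature")
    return at_44 / at_30


if __name__ == "__main__":
    for sample, (at_30, at_44) in COLONIES.items():
        print(f"{sample}: {colony_ratio(at_30, at_44):.2f}")
```

Under this summary, every coaX ligation gives a ratio above 0.5, whereas the vector-only control gives zero, which is the pattern expected if growth at 44° C. depends on the cloned insert.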

[0119] Since the coaX homologs cloned in pASK-1BA3 were inserted downstream of a Tet-inducible promoter, enzyme for in vitro screening assays can be obtained by inducing gene expression as described by Sigma-Genosys, and then isolating the overproduced pantothenate kinase by conventional protein purification procedures. Alternatively, the coaX gene can be cloned into any of various protein or peptide fusion expression vectors that facilitate purification of the protein. For example, Helicobacter pylori, Borrelia burgdorferi, Bordetella pertussis, and Pseudomonas aeruginosa coaX genes can be cloned into protein fusion expression vectors such as those available from companies including but not limited to Qiagen™ or Invitrogen™ to produce His-tagged CoaX fusion proteins or glutathione-S-transferase/CoaX fusion proteins, which can be isolated by binding to nickel affinity or glutathione sepharose resins, respectively.

[0120] Equivalents

[0121] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

1 77 1 777 DNA Bacillus subtilis CDS (1)..(774) 1 ttg tta ctg gtt atcgat gtg ggg aac acc aat act gta ctt ggt gta 48 Leu Leu Leu Val Ile AspVal Gly Asn Thr Asn Thr Val Leu Gly Val 1 5 10 15 tat cat gat gga aaatta gaa tat cac tgg cgt ata gaa aca agc agg 96 Tyr His Asp Gly Lys LeuGlu Tyr His Trp Arg Ile Glu Thr Ser Arg 20 25 30 cat aaa aca gaa gat gagttt ggg atg att ttg cgc tcc tta ttt gat 144 His Lys Thr Glu Asp Glu PheGly Met Ile Leu Arg Ser Leu Phe Asp 35 40 45 cac tcc ggg ctt atg ttt gaacag ata gat ggc att att att tcg tca 192 His Ser Gly Leu Met Phe Glu GlnIle Asp Gly Ile Ile Ile Ser Ser 50 55 60 gta gtg ccg cca atc atg ttt gcgtta gaa aga atg tgc aca aaa tac 240 Val Val Pro Pro Ile Met Phe Ala LeuGlu Arg Met Cys Thr Lys Tyr 65 70 75 80 ttt cat atc gag cct caa att gttggt cca ggt atg aaa acc ggt tta 288 Phe His Ile Glu Pro Gln Ile Val GlyPro Gly Met Lys Thr Gly Leu 85 90 95 aat ata aaa tat gac aat ccg aaa gaagta ggg gca gac aga atc gta 336 Asn Ile Lys Tyr Asp Asn Pro Lys Glu ValGly Ala Asp Arg Ile Val 100 105 110 aat gct gtc gct gcg ata cac ttg tacggc aat cca tta att gtt gtc 384 Asn Ala Val Ala Ala Ile His Leu Tyr GlyAsn Pro Leu Ile Val Val 115 120 125 gat ttc gga acc gcc aca acg tac tgctat att gat gaa aac aaa caa 432 Asp Phe Gly Thr Ala Thr Thr Tyr Cys TyrIle Asp Glu Asn Lys Gln 130 135 140 tac atg ggc ggg gcg att gcc cct gggatt aca att tcg aca gag gcg 480 Tyr Met Gly Gly Ala Ile Ala Pro Gly IleThr Ile Ser Thr Glu Ala 145 150 155 160 ctt tac tcg cgt gca gca aag cttcct cgt atc gaa atc acc cgg ccc 528 Leu Tyr Ser Arg Ala Ala Lys Leu ProArg Ile Glu Ile Thr Arg Pro 165 170 175 gac aat att atc gga aaa aac actgtt agc gcg atg caa tct gga att 576 Asp Asn Ile Ile Gly Lys Asn Thr ValSer Ala Met Gln Ser Gly Ile 180 185 190 tta ttt ggc tat gtc ggc caa gtggaa gga atc gtt aag cga atg aaa 624 Leu Phe Gly Tyr Val Gly Gln Val GluGly Ile Val Lys Arg Met Lys 195 200 205 tgg cag gca aaa cag gac ctc aaggtc att gcg aca gga ggc ctg gcg 672 Trp Gln Ala Lys Gln Asp Leu Lys ValIle Ala Thr 
Gly Gly Leu Ala 210 215 220 ccg ctc att gcg aac gaa tca gattgt ata gac atc gtt gat cca ttc 720 Pro Leu Ile Ala Asn Glu Ser Asp CysIle Asp Ile Val Asp Pro Phe 225 230 235 240 tta acc cta aaa ggg ctg gaattg att tat gaa aga aac cgc gta gga 768 Leu Thr Leu Lys Gly Leu Glu LeuIle Tyr Glu Arg Asn Arg Val Gly 245 250 255 agt gta tag 777 Ser Val 2258 PRT Bacillus subtilis 2 Leu Leu Leu Val Ile Asp Val Gly Asn Thr AsnThr Val Leu Gly Val 1 5 10 15 Tyr His Asp Gly Lys Leu Glu Tyr His TrpArg Ile Glu Thr Ser Arg 20 25 30 His Lys Thr Glu Asp Glu Phe Gly Met IleLeu Arg Ser Leu Phe Asp 35 40 45 His Ser Gly Leu Met Phe Glu Gln Ile AspGly Ile Ile Ile Ser Ser 50 55 60 Val Val Pro Pro Ile Met Phe Ala Leu GluArg Met Cys Thr Lys Tyr 65 70 75 80 Phe His Ile Glu Pro Gln Ile Val GlyPro Gly Met Lys Thr Gly Leu 85 90 95 Asn Ile Lys Tyr Asp Asn Pro Lys GluVal Gly Ala Asp Arg Ile Val 100 105 110 Asn Ala Val Ala Ala Ile His LeuTyr Gly Asn Pro Leu Ile Val Val 115 120 125 Asp Phe Gly Thr Ala Thr ThrTyr Cys Tyr Ile Asp Glu Asn Lys Gln 130 135 140 Tyr Met Gly Gly Ala IleAla Pro Gly Ile Thr Ile Ser Thr Glu Ala 145 150 155 160 Leu Tyr Ser ArgAla Ala Lys Leu Pro Arg Ile Glu Ile Thr Arg Pro 165 170 175 Asp Asn IleIle Gly Lys Asn Thr Val Ser Ala Met Gln Ser Gly Ile 180 185 190 Leu PheGly Tyr Val Gly Gln Val Glu Gly Ile Val Lys Arg Met Lys 195 200 205 TrpGln Ala Lys Gln Asp Leu Lys Val Ile Ala Thr Gly Gly Leu Ala 210 215 220Pro Leu Ile Ala Asn Glu Ser Asp Cys Ile Asp Ile Val Asp Pro Phe 225 230235 240 Leu Thr Leu Lys Gly Leu Glu Leu Ile Tyr Glu Arg Asn Arg Val Gly245 250 255 Ser Val 3 250 PRT Clostridium acetobutylicum 3 Asn Lys ArgAla Ala Phe Met Leu Leu Leu Phe Leu Arg Ser Val Leu 1 5 10 15 Lys ValIle Leu Val Leu Asp Val Gly Asn Thr Asn Ile Val Leu Gly 20 25 30 Ile TyrAsn Asp Thr Lys Leu Thr Ala Glu Trp Arg Leu Ser Thr Asp 35 40 45 Val LeuArg Ser Ala Asp Glu Tyr Gly Ile Gln Val Met Asn Leu Phe 50 55 60 Gln GlnAsp Lys Leu Asp Pro Thr Leu Val Glu Gly Val Ile Ile Ser 65 70 75 80 SerVal Val Pro Asn Ile Met 
Tyr Ser Leu Glu His Met Ile Arg Lys 85 90 95 TyrPhe Lys Ile Asn Pro Leu Val Val Gly Pro Gly Ile Lys Thr Gly 100 105 110Ile Asn Ile Lys Tyr Asp Asn Pro Lys Glu Val Gly Ala Asp Arg Ile 115 120125 Val Asn Ala Val Ala Ala His Glu Ile Tyr Lys Arg Ser Leu Ile Ile 130135 140 Ile Asp Phe Gly Thr Ala Thr Thr Phe Cys Ala Val Arg Glu Asn Gly145 150 155 160 Asp Tyr Leu Gly Gly Ala Ile Cys Pro Gly Ile Lys Val SerSer Glu 165 170 175 Ala Leu Phe Glu Lys Ala Ala Lys Leu Pro Arg Val GluLeu Ile Lys 180 185 190 Pro Ala Tyr Ala Ile Cys Lys Asn Thr Ile Ser SerIle Gln Ser Gly 195 200 205 Ile Val Tyr Arg Tyr Leu Arg Gln Val Lys TyrLeu Phe Glu Lys Leu 210 215 220 Lys Glu Asn Leu Pro Asp Gly Arg Arg ThrArg Thr Ser Leu Val Leu 225 230 235 240 Ala Thr Gly Gly Leu Ala Lys LeuIle Asn 245 250 4 265 PRT Streptomyces coelicolor 4 Met Leu Leu Thr IleAsp Val Gly Asn Thr His Thr Val Leu Gly Leu 1 5 10 15 Phe Asp Gly GluAsp Ile Val Glu His Trp Arg Ile Ser Thr Asp Ser 20 25 30 Arg Arg Thr AlaAsp Glu Leu Ala Val Leu Leu Gln Gly Leu Met Gly 35 40 45 Met His Pro LeuLeu Gly Asp Glu Leu Gly Asp Gly Ile Asp Gly Ile 50 55 60 Ala Ile Cys AlaThr Val Pro Ser Val Leu His Glu Leu Arg Glu Val 65 70 75 80 Thr Arg ArgTyr Tyr Gly Asp Val Pro Ala Val Leu Val Glu Pro Gly 85 90 95 Val Lys ThrGly Val Pro Ile Leu Thr Asp His Pro Lys Glu Val Gly 100 105 110 Ala AspArg Ile Ile Asn Ala Val Ala Ala Val Glu Leu Tyr Gly Gly 115 120 125 ProAla Ile Val Val Asp Phe Gly Thr Ala Thr Thr Phe Asp Ala Val 130 135 140Ser Ala Arg Gly Glu Tyr Ile Gly Gly Val Ile Ala Pro Gly Ile Glu 145 150155 160 Ile Ser Val Glu Ala Leu Gly Val Lys Gly Ala Gln Leu Arg Lys Ile165 170 175 Glu Val Ala Arg Pro Arg Ser Val Ile Gly Lys Asn Thr Val GluAla 180 185 190 Met Gln Ser Gly Ile Val Tyr Gly Phe Ala Gly Gln Val AspGly Val 195 200 205 Val Asn Arg Met Ala Arg Glu Leu Ala Asp Asp Pro AspAsp Val Thr 210 215 220 Val Ile Ala Thr Gly Gly Leu Ala Pro Met Val LeuGly Glu Ser Ser 225 230 235 240 Val Ile Asp Glu His Glu Pro Trp Leu ThrLeu Met Gly Leu Arg Leu 245 250 
255 Val Tyr Glu Arg Asn Val Ser Arg Met260 265 5 272 PRT Mycobacterium tuberculosis 5 Met Leu Leu Ala Ile AspVal Arg Asn Thr His Thr Val Val Gly Leu 1 5 10 15 Leu Ser Gly Met LysGlu His Ala Lys Val Val Gln Gln Trp Arg Ile 20 25 30 Arg Thr Glu Ser GluVal Thr Ala Asp Glu Leu Ala Leu Thr Ile Asp 35 40 45 Gly Leu Ile Gly GluAsp Ser Glu Arg Leu Thr Gly Thr Ala Ala Leu 50 55 60 Ser Thr Val Pro SerVal Leu His Glu Val Arg Ile Met Leu Asp Gln 65 70 75 80 Tyr Trp Pro SerVal Pro His Val Leu Ile Glu Pro Gly Val Arg Thr 85 90 95 Gly Ile Pro LeuLeu Val Asp Asn Pro Lys Glu Val Gly Ala Asp Arg 100 105 110 Ile Val AsnCys Leu Ala Ala Tyr Asp Arg Phe Arg Lys Ala Ala Ile 115 120 125 Val ValAsp Phe Gly Ser Ser Ile Cys Val Asp Val Val Ser Ala Lys 130 135 140 GlyGlu Phe Leu Gly Gly Ala Ile Ala Pro Gly Val Gln Val Ser Ser 145 150 155160 Asp Ala Ala Ala Ala Arg Ser Ala Ala Leu Arg Arg Val Glu Leu Ala 165170 175 Arg Pro Arg Ser Val Val Gly Lys Asn Thr Val Glu Cys Met Gln Ala180 185 190 Gly Ala Val Phe Gly Phe Ala Gly Leu Val Asp Gly Leu Val GlyArg 195 200 205 Ile Arg Glu Asp Val Ser Gly Phe Ser Val Asp His Asp ValAla Ile 210 215 220 Val Ala Thr Gly His Thr Ala Pro Leu Leu Leu Pro GluLeu His Thr 225 230 235 240 Val Asp His Tyr Asp Gln His Leu Thr Leu GlnGly Leu Arg Leu Val 245 250 255 Phe Glu Arg Asn Leu Glu Val Gln Arg GlyArg Leu Lys Thr Ala Arg 260 265 270 6 258 PRT Rhodobacter capsulatus 6Met Leu Leu Cys Ile Asp Cys Gly Asn Thr Asn Thr Val Phe Ser Val 1 5 1015 Trp Asp Gly Thr Asp Phe Ala Ala Thr Trp Arg Ile Ala Thr Asp His 20 2530 Arg Arg Thr Ala Asp Glu Tyr Phe Val Trp Leu Asn Thr Leu Met Gln 35 4045 Leu Lys Gly Leu Gln Gly Arg Ile Ser Glu Ala Ile Ile Ser Ser Thr 50 5560 Ala Pro Arg Val Val Phe Asn Leu Arg Val Leu Cys Asn Arg Tyr Phe 65 7075 80 Asp Cys Arg Pro Tyr Val Val Gly Lys Pro Gly Cys Glu Leu Pro Val 8590 95 Ala Pro Arg Val Asp Pro Gly Thr Thr Val Gly Pro Asp Arg Leu Val100 105 110 Asn Thr Val Ala Gly Tyr Asp Arg His Gly Gly Asp Leu Ile ValVal 115 120 125 Asp Phe Gly Thr Ala Thr 
Thr Phe Asp Val Val Ala Pro AspGly Ala 130 135 140 Tyr Ile Gly Gly Val Ile Ala Pro Gly Val Asn Leu SerLeu Glu Ala 145 150 155 160 Leu His Met Ala Ala Ala Ala Leu Pro His ValAsp Val Thr Lys Pro 165 170 175 Gln Gly Val Ile Gly Thr Asn Thr Val AlaCys Ile Gln Ser Gly Val 180 185 190 Tyr Trp Gly Tyr Ile Gly Leu Val GluGly Ile Val Arg Gln Ile Arg 195 200 205 Met Glu Arg Asp Arg Pro Met LysVal Ile Ala Thr Gly Gly Leu Ala 210 215 220 Ser Leu Phe Asp Leu Gly PheAsp Leu Phe Asp Lys Val Glu Asp Asp 225 230 235 240 Leu Thr Met His GlyLeu Arg Leu Ile Phe Asp Tyr Asn Lys Gly Leu 245 250 255 Gly Ala 7 255PRT Geobacter sulfurreducens 7 Met Leu Leu Val Ile Asp Val Gly Asn ThrAsn Ile Val Leu Gly Ile 1 5 10 15 Tyr Asp Gly Glu Arg Leu Val Arg AspTrp Arg Val Ser Thr Asp Lys 20 25 30 Ala Arg Thr Thr Asp Glu Tyr Gly IleLeu Ile Asn Glu Leu Phe Arg 35 40 45 Leu Ala Gly Leu Gly Leu Asp Gln IleArg Ala Val Ile Ile Ser Ser 50 55 60 Val Val Pro Pro Leu Thr Gly Val LeuGlu Arg Leu Ser Leu Gly Tyr 65 70 75 80 Phe Gly Met Arg Pro Leu Val ValGly Pro Gly Ile Lys Thr Gly Met 85 90 95 Pro Ile Gln Tyr Asp Asn Pro ArgGlu Val Gly Ala Asp Arg Ile Val 100 105 110 Asn Ala Val Ala Gly Tyr GluLys Tyr Arg Thr Ser Leu Ile Ile Val 115 120 125 Asp Phe Gly Thr Ala ThrThr Phe Asp Tyr Val Asn Arg Lys Gly Glu 130 135 140 Tyr Cys Gly Gly AlaIle Ala Pro Gly Leu Val Ile Ser Thr Glu Ala 145 150 155 160 Leu Phe GlnArg Ala Ser Lys Leu Pro Arg Val Asp Ile Ile Arg Pro 165 170 175 Ser AlaIle Ile Ala Arg Asn Thr Val Asn Ser Met Gln Ala Gly Ile 180 185 190 TyrTyr Gly Tyr Val Gly Leu Val Asp Glu Ile Val Thr Arg Met Lys 195 200 205Ala Glu Ser Lys Asp Ala Pro Arg Val Ile Ala Thr Gly Gly Leu Ala 210 215220 Ser Leu Ile Ala Pro Glu Ser Lys Thr Ile Glu Ala Val Glu Glu Tyr 225230 235 240 Leu Thr Leu Glu Gly Leu Arg Ile Leu Tyr Glu Arg Asn Arg Glu245 250 255 8 262 PRT Deinococcus radiopugnans 8 Met Pro Ala Phe Pro LeuLeu Ala Val Asp Ile Gly Asn Thr Thr Thr 1 5 10 15 Val Leu Gly Leu AlaAsp Ala Ser Gly Ala Leu Thr His Thr Trp Arg 20 25 30 
Ile Arg Thr Asn ArgGlu Met Leu Pro Asp Asp Leu Ala Leu Gln Leu 35 40 45 His Gly Leu Phe ThrLeu Ala Gly Ala Pro Ile Pro Arg Ala Ala Val 50 55 60 Leu Ser Ser Val AlaPro Pro Val Gly Glu Asn Tyr Ala Leu Ala Leu 65 70 75 80 Lys Arg His PheMet Ile Asp Ala Phe Ala Val Ser Ala Glu Asn Leu 85 90 95 Pro Asp Val ThrVal Glu Leu Asp Thr Pro Gly Ser Val Gly Ala Asp 100 105 110 Arg Leu CysAsn Leu Phe Gly Ala Glu Lys Tyr Leu Gly Gly Leu Asp 115 120 125 Tyr AlaVal Val Val Asp Phe Gly Thr Ser Thr Asn Phe Asp Val Val 130 135 140 GlyArg Gly Arg Arg Phe Leu Gly Gly Ile Leu Ala Thr Gly Ala Gln 145 150 155160 Val Ser Ala Asp Ala Leu Phe Ala Arg Ala Ala Lys Leu Pro Arg Ile 165170 175 Thr Leu Gln Ala Pro Glu Thr Ala Ile Gly Lys Asn Thr Val His Ala180 185 190 Leu Gln Ser Gly Leu Val Phe Gly Tyr Ala Glu Met Val Asp GlyLeu 195 200 205 Leu Arg Arg Ile Arg Ala Glu Leu Pro Gly Glu Ala Val AlaVal Ala 210 215 220 Thr Gly Gly Phe Ser Arg Thr Val Gln Gly Ile Cys GlnGlu Ile Asp 225 230 235 240 Tyr Tyr Asp Glu Thr Leu Thr Leu Arg Gly LeuVal Glu Leu Trp Ala 245 250 255 Ser Arg Ser Glu Val Arg 260 9 246 PRTThermotoga maritima 9 Met Tyr Leu Leu Val Asp Val Gly Asn Thr His SerVal Phe Ser Ile 1 5 10 15 Thr Glu Asp Gly Lys Thr Phe Arg Arg Trp ArgLeu Ser Thr Gly Val 20 25 30 Phe Gln Thr Glu Asp Glu Leu Phe Ser His LeuHis Pro Leu Leu Gly 35 40 45 Asp Ala Met Arg Glu Ile Lys Gly Ile Gly ValAla Ser Val Val Pro 50 55 60 Thr Gln Asn Thr Val Ile Glu Arg Phe Ser GlnLys Tyr Phe His Ile 65 70 75 80 Ser Pro Ile Trp Val Lys Ala Lys Asn GlyCys Val Lys Trp Asn Val 85 90 95 Lys Asn Pro Ser Glu Val Gly Ala Asp ArgVal Ala Asn Val Val Ala 100 105 110 Phe Val Lys Glu Tyr Gly Lys Asn GlyIle Ile Ile Asp Met Gly Thr 115 120 125 Ala Thr Thr Val Asp Leu Val ValAsn Gly Ser Tyr Glu Gly Gly Ala 130 135 140 Ile Leu Pro Gly Phe Phe MetMet Val His Ser Leu Phe Arg Gly Thr 145 150 155 160 Ala Lys Leu Pro LeuVal Glu Val Lys Pro Ala Asp Phe Val Val Gly 165 170 175 Lys Asp Thr GluGlu Asn Ile Arg Leu Gly Val Val Asn Gly Ser Val 180 185 190 
Tyr Ala LeuGlu Gly Ile Ile Gly Arg Ile Lys Glu Val Tyr Gly Asp 195 200 205 Leu ProVal Val Leu Thr Gly Gly Gln Ser Lys Ile Val Lys Asp Met 210 215 220 IleLys His Glu Ile Phe Asp Glu Asp Leu Thr Ile Lys Gly Val Tyr 225 230 235240 His Phe Cys Phe Gly Asp 245 10 273 PRT Treponema pallidum 10 Met LeuLeu Ile Asp Val Gly Asn Ser His Val Val Phe Gly Ile Gln 1 5 10 15 GlyGlu Asn Gly Gly Arg Val Cys Val Arg Glu Leu Phe Arg Leu Ala 20 25 30 ProAsp Ala Arg Lys Thr Gln Asp Glu Tyr Ser Leu Leu Ile His Ala 35 40 45 LeuCys Glu Arg Ala Gly Val Gly Arg Ala Ser Leu Arg Asp Ala Phe 50 55 60 IleSer Ser Val Val Pro Val Leu Thr Lys Thr Ile Ala Asp Ala Val 65 70 75 80Ala Gln Ile Ser Gly Val Gln Pro Val Val Phe Gly Pro Trp Ala Tyr 85 90 95Glu His Leu Pro Val Arg Ile Pro Glu Pro Val Arg Ala Glu Ile Gly 100 105110 Thr Asp Leu Val Ala Asn Ala Val Ala Ala Tyr Val His Phe Arg Ser 115120 125 Ala Cys Val Val Val Asp Cys Gly Thr Ala Leu Thr Phe Thr Ala Val130 135 140 Asp Gly Thr Gly Leu Ile Gln Gly Val Ala Ile Ala Pro Gly LeuArg 145 150 155 160 Thr Ala Val Gln Ser Leu His Thr Gly Thr Ala Gln LeuPro Leu Val 165 170 175 Pro Leu Ala Leu Pro Asp Ser Val Leu Gly Lys AspThr Thr His Ala 180 185 190 Val Gln Ala Gly Val Val Arg Gly Thr Leu PheVal Ile Arg Ala Met 195 200 205 Ile Ala Gln Cys Gln Lys Glu Leu Gly CysArg Cys Ala Ala Val Ile 210 215 220 Thr Gly Gly Leu Ser Arg Leu Phe SerSer Glu Val Asp Phe Pro Pro 225 230 235 240 Ile Asp Ala Gln Leu Thr LeuSer Gly Leu Ala His Ile Ala Arg Leu 245 250 255 Val Pro Thr Ser Leu LeuPro Pro Ala Thr Val Ser Gly Ser Ser Gly 260 265 270 Asn 11 262 PRTBorrelia burgdorferi 11 Met Asn Lys Pro Leu Leu Ser Glu Leu Ile Ile AspIle Gly Asn Thr 1 5 10 15 Ser Ile Ala Phe Ala Leu Phe Lys Asp Asn GlnVal Asn Leu Phe Ile 20 25 30 Lys Met Lys Thr Asn Leu Met Leu Arg Tyr AspGlu Val Tyr Ser Phe 35 40 45 Phe Glu Glu Asn Phe Asp Phe Asn Val Asn LysVal Phe Ile Ser Ser 50 55 60 Val Val Pro Ile Leu Asn Glu Thr Phe Lys AsnVal Ile Phe Ser Phe 65 70 75 80 Phe Lys Ile Lys Pro Leu Phe Ile Gly 
PheAsp Leu Asn Tyr Asp Leu 85 90 95 Thr Phe Asn Pro Tyr Lys Ser Asp Lys PheLeu Leu Gly Ser Asp Val 100 105 110 Phe Ala Asn Leu Val Ala Ala Ile GluAsn Tyr Ser Phe Glu Asn Val 115 120 125 Leu Val Val Asp Leu Gly Thr AlaCys Thr Ile Phe Ala Val Ser Arg 130 135 140 Gln Asp Gly Ile Leu Gly GlyIle Ile Asn Ser Gly Pro Leu Ile Asn 145 150 155 160 Phe Asn Ser Leu LeuAsp Asn Ala Tyr Leu Ile Lys Lys Phe Pro Ile 165 170 175 Ser Thr Pro AsnAsn Leu Leu Glu Arg Thr Thr Ser Gly Ser Val Asn 180 185 190 Ser Gly LeuPhe Tyr Gln Tyr Lys Tyr Leu Ile Glu Gly Val Tyr Arg 195 200 205 Asp IleLys Gln Met Tyr Lys Lys Lys Phe Asn Leu Ile Ile Thr Gly 210 215 220 GlyAsn Ala Asp Leu Ile Leu Ser Leu Ile Glu Ile Glu Phe Ile Phe 225 230 235240 Asn Ile His Leu Thr Val Glu Gly Val Arg Ile Leu Gly Asn Ser Ile 245250 255 Asp Phe Lys Phe Val Asn 260 12 229 PRT Aquifex aeolicus 12 MetArg Phe Leu Thr Val Asp Val Gly Asn Ser Ser Val Asp Ile Ala 1 5 10 15Leu Trp Glu Gly Lys Lys Val Lys Asp Phe Leu Lys Leu Ser His Glu 20 25 30Glu Phe Leu Lys Glu Glu Phe Pro Lys Leu Lys Ala Leu Gly Ile Ser 35 40 45Val Lys Gln Ser Phe Ser Glu Lys Val Arg Gly Lys Ile Pro Lys Ile 50 55 60Lys Phe Leu Lys Lys Glu Asn Phe Pro Ile Gln Val Asp Tyr Lys Thr 65 70 7580 Pro Glu Thr Leu Gly Thr Asp Arg Val Ala Leu Ala Tyr Ser Ala Lys 85 9095 Lys Phe Tyr Gly Lys Asn Val Val Val Ile Ser Ala Gly Thr Ala Leu 100105 110 Val Ile Asp Leu Val Leu Glu Gly Lys Phe Lys Gly Gly Phe Ile Thr115 120 125 Leu Gly Leu Gly Lys Lys Leu Lys Ile Leu Ser Asp Leu Ala GluGly 130 135 140 Ile Pro Glu Phe Phe Pro Glu Glu Val Glu Ile Phe Leu GlyArg Ser 145 150 155 160 Thr Arg Glu Cys Val Leu Gly Gly Ala Tyr Arg GluSer Thr Glu Phe 165 170 175 Ile Lys Ser Thr Leu Lys Leu Trp Arg Lys ValPhe Lys Arg Lys Phe 180 185 190 Lys Val Val Ile Thr Gly Gly Glu Gly LysTyr Phe Ser Lys Phe Gly 195 200 205 Ile Tyr Asp Pro Leu Leu Val His ArgGly Met Arg Asn Leu Leu Tyr 210 215 220 Leu Tyr His Arg Ile 225 13 257PRT Synechocystis sp. 
13 Met Glu Thr Ser Lys Pro Gly Cys Gly Leu Ala LeuAsp Asn Asp Lys 1 5 10 15 Gln Lys Pro Trp Leu Gly Leu Met Ile Gly AsnSer Arg Leu His Trp 20 25 30 Ala Tyr Cys Ser Gly Asn Ala Pro Leu Gln ThrTrp Val Thr Asp Tyr 35 40 45 Asn Pro Lys Ser Ala Gln Leu Pro Val Leu LeuGly Lys Val Pro Leu 50 55 60 Met Leu Ala Ser Val Val Pro Glu Gln Thr GluVal Trp Arg Val Tyr 65 70 75 80 Gln Pro Lys Ile Leu Thr Leu Lys Asn LeuPro Leu Val Asn Leu Tyr 85 90 95 Pro Ser Phe Gly Ile Asp Arg Ala Leu AlaGly Leu Gly Thr Gly Leu 100 105 110 Thr Tyr Gly Phe Pro Cys Leu Val ValAsp Gly Gly Thr Ala Leu Thr 115 120 125 Ile Thr Gly Phe Asp Gln Asp LysLys Leu Val Gly Gly Ala Ile Leu 130 135 140 Pro Gly Leu Gly Leu Gln LeuAla Thr Leu Gly Asp Arg Leu Ala Ala 145 150 155 160 Leu Pro Lys Leu GluMet Asp Gln Leu Thr Glu Leu Pro Asp Arg Trp 165 170 175 Ala Leu Asp ThrPro Ser Ala Ile Phe Ser Gly Val Val Tyr Gly Val 180 185 190 Leu Gly AlaLeu Gln Ser Tyr Leu Gln Asp Trp Gln Lys Leu Phe Pro 195 200 205 Gly AlaAla Met Val Ile Thr Gly Gly Asp Gly Lys Ile Leu His Gly 210 215 220 PheLeu Lys Glu His Ser Pro Asn Leu Ser Val Ala Trp Asp Asp Asn 225 230 235240 Leu Ile Phe Leu Gly Met Ala Ala Ile His His Gly Asp Arg Pro Ile 245250 255 Cys 14 223 PRT Helicobacter pylori 14 Met Pro Ala Arg Gln SerPhe Thr Asp Leu Lys Asn Leu Val Leu Cys 1 5 10 15 Asp Ile Gly Asn ThrArg Ile His Phe Ala Gln Asn Tyr Gln Leu Phe 20 25 30 Ser Ser Ala Lys GluAsp Leu Lys Arg Leu Gly Ile Gln Lys Glu Ile 35 40 45 Phe Tyr Ile Ser ValAsn Glu Glu Asn Glu Lys Ala Leu Leu Asn Cys 50 55 60 Tyr Pro Asn Ala LysAsn Ile Ala Gly Phe Phe His Leu Glu Thr Asp 65 70 75 80 Tyr Val Gly LeuGly Ile Asp Arg Gln Met Ala Cys Leu Ala Val Asn 85 90 95 Asn Gly Val ValVal Asp Ala Gly Ser Ala Ile Thr Ile Asp Leu Ile 100 105 110 Lys Glu GlyLys His Leu Gly Gly Cys Ile Leu Pro Gly Leu Ala Gln 115 120 125 Tyr IleHis Ala Tyr Lys Lys Ser Ala Lys Ile Leu Glu Gln Pro Phe 130 135 140 LysAla Leu Asp Ser Leu Glu Val Leu Pro Lys Ser Thr Arg Asp Ala 145 150 155160 Val Asn Tyr Gly Met Val 
Leu Ser Val Ile Ala Cys Ile Gln His Leu 165170 175 Ala Lys Asn Gln Lys Ile Tyr Leu Cys Gly Gly Asp Ala Lys Tyr Leu180 185 190 Ser Ala Phe Leu Pro His Ser Val Cys Lys Glu Arg Leu Val PheAsp 195 200 205 Gly Met Glu Ile Ala Leu Lys Lys Ala Gly Ile Leu Glu CysLys 210 215 220 15 267 PRT Bordetella pertussis 15 Met Ile Ile Leu IleAsp Ser Gly Asn Ser Arg Leu Lys Val Gly Trp 1 5 10 15 Phe Asp Pro AspAla Pro Gln Ala Ala Arg Glu Pro Ala Pro Val Ala 20 25 30 Phe Asp Asn LeuAsp Leu Asp Ala Leu Gly Arg Trp Leu Ala Thr Leu 35 40 45 Pro Arg Arg ProGln Arg Ala Leu Gly Val Asn Val Ala Gly Leu Ala 50 55 60 Arg Gly Glu AlaIle Ala Ala Thr Leu Arg Ala Gly Gly Cys Asp Ile 65 70 75 80 Arg Trp LeuArg Ala Gln Pro Leu Ala Met Gly Leu Arg Asn Gly Tyr 85 90 95 Arg Asn ProAsp Gln Leu Gly Ala Asp Arg Trp Ala Cys Met Val Gly 100 105 110 Val LeuAla Arg Gln Pro Ser Val His Pro Pro Leu Leu Val Ala Ser 115 120 125 PheGly Thr Ala Thr Thr Leu Asp Thr Ile Gly Pro Asp Asn Val Phe 130 135 140Pro Gly Gly Leu Ile Leu Pro Gly Pro Ala Met Met Arg Gly Ala Leu 145 150155 160 Ala Tyr Gly Thr Ala His Leu Pro Leu Ala Asp Gly Leu Val Ala Asp165 170 175 Tyr Pro Ile Asp Thr His Gln Ala Ile Ala Ser Gly Ile Ala AlaAla 180 185 190 Gln Ala Gly Ala Ile Val Arg Gln Trp Leu Ala Gly Arg GlnArg Tyr 195 200 205 Gly Gln Ala Pro Glu Ile Tyr Val Ala Gly Gly Gly TrpPro Glu Val 210 215 220 Arg Gln Glu Ala Glu Arg Leu Leu Ala Val Thr GlyAla Ala Phe Gly 225 230 235 240 Ala Thr Pro Gln Pro Thr Tyr Leu Asp SerPro Val Leu Asp Gly Leu 245 250 255 Ala Ala Leu Ala Ala Gln Gly Ala ProThr Ala 260 265 16 702 DNA Bacillus subtilis CDS (1)..(699) 16 ttg ttactg gtt atc gat gtg ggg aac acc aat act gta ctt ggt gta 48 Met Leu LeuVal Ile Asp Val Gly Asn Thr Asn Thr Val Leu Gly Val 1 5 10 15 tat catgat gga aaa tta gaa tat cac tgg cgt ata gaa aca agc agg 96 Tyr His AspGly Lys Leu Glu Tyr His Trp Arg Ile Glu Thr Ser Arg 20 25 30 cat aaa acagaa gat gag ttt ggg atg att ttg cgc tcc tta ttt gat 144 His Lys Thr GluAsp Glu Phe Gly Met Ile Leu Arg Ser Leu 
Phe Asp 35 40 45 cac tcc ggg cttatg ttt gaa cag ata gat ggc att att att tcg tca 192 His Ser Gly Leu MetPhe Glu Gln Ile Asp Gly Ile Ile Ile Ser Ser 50 55 60 gta gtg ccg cca atcatg ttt gcg tta gaa aga atg tgc aca aaa tac 240 Val Val Pro Pro Ile MetPhe Ala Leu Glu Arg Met Cys Thr Lys Tyr 65 70 75 80 ttt cat atc gag cctcaa att gtt ggt cca ggt atg aaa acc ggt tta 288 Phe His Ile Glu Pro GlnIle Val Gly Pro Gly Met Lys Thr Gly Leu 85 90 95 aat ata aaa tat gac aatccg aaa gaa gta ggg gca gac aga atc gta 336 Asn Ile Lys Tyr Asp Asn ProLys Glu Val Gly Ala Asp Arg Ile Val 100 105 110 aat gct gtc gct gcg atacac ttg tac ggc aat cca tta att gtt gtc 384 Asn Ala Val Ala Ala Ile HisLeu Tyr Gly Asn Pro Leu Ile Val Val 115 120 125 gat ttc gga acc gcc acaacg tac tgc tat att gat gaa aac aaa caa 432 Asp Phe Gly Thr Ala Thr ThrTyr Cys Tyr Ile Asp Glu Asn Lys Gln 130 135 140 tac atg ggc ggg gcg attgcc cct ggg att aca att tcg aca gag gcg 480 Tyr Met Gly Gly Ala Ile AlaPro Gly Ile Thr Ile Ser Thr Glu Ala 145 150 155 160 ctt tac tcg cgt gcagca aag ctt cct cgt atc gaa atc acc cgg ccc 528 Leu Tyr Ser Arg Ala AlaLys Leu Pro Arg Ile Glu Ile Thr Arg Pro 165 170 175 gac aat att atc ggaaaa aac act gtt agc gcg atg caa tct gga att 576 Asp Asn Ile Ile Gly LysAsn Thr Val Ser Ala Met Gln Ser Gly Ile 180 185 190 tta ttt ggc tat gtcggc caa gtg gaa gga atc gtt aag cga atg aaa 624 Leu Phe Gly Tyr Val GlyGln Val Glu Gly Ile Val Lys Arg Met Lys 195 200 205 tgg cag gca aaa caggac cca agg tca ttg cga cag gag gcc tgg cgc 672 Trp Gln Ala Lys Gln AspPro Arg Ser Leu Arg Gln Glu Ala Trp Arg 210 215 220 cgc tca ttg cga acgaat cag att gta tag 702 Arg Ser Leu Arg Thr Asn Gln Ile Val 225 230 17233 PRT Bacillus subtilis 17 Met Leu Leu Val Ile Asp Val Gly Asn Thr AsnThr Val Leu Gly Val 1 5 10 15 Tyr His Asp Gly Lys Leu Glu Tyr His TrpArg Ile Glu Thr Ser Arg 20 25 30 His Lys Thr Glu Asp Glu Phe Gly Met IleLeu Arg Ser Leu Phe Asp 35 40 45 His Ser Gly Leu Met Phe Glu Gln Ile AspGly Ile Ile Ile Ser Ser 50 55 60 Val Val Pro Pro 
Ile Met Phe Ala Leu GluArg Met Cys Thr Lys Tyr 65 70 75 80 Phe His Ile Glu Pro Gln Ile Val GlyPro Gly Met Lys Thr Gly Leu 85 90 95 Asn Ile Lys Tyr Asp Asn Pro Lys GluVal Gly Ala Asp Arg Ile Val 100 105 110 Asn Ala Val Ala Ala Ile His LeuTyr Gly Asn Pro Leu Ile Val Val 115 120 125 Asp Phe Gly Thr Ala Thr ThrTyr Cys Tyr Ile Asp Glu Asn Lys Gln 130 135 140 Tyr Met Gly Gly Ala IleAla Pro Gly Ile Thr Ile Ser Thr Glu Ala 145 150 155 160 Leu Tyr Ser ArgAla Ala Lys Leu Pro Arg Ile Glu Ile Thr Arg Pro 165 170 175 Asp Asn IleIle Gly Lys Asn Thr Val Ser Ala Met Gln Ser Gly Ile 180 185 190 Leu PheGly Tyr Val Gly Gln Val Glu Gly Ile Val Lys Arg Met Lys 195 200 205 TrpGln Ala Lys Gln Asp Pro Arg Ser Leu Arg Gln Glu Ala Trp Arg 210 215 220Arg Ser Leu Arg Thr Asn Gln Ile Val 225 230 18 163 DNA ArtificialSequence Description of Artificial Sequencepromoter sequence 18gcctacctag cttccaagaa agatatccta acagcacaag agcggaaaga tgttttgttc 60tacatccaga acaacctctg ctaaaattcc tgaaaaattt tgcaaaaagt tgttgacttt 120atctacaagg tgtggtataa taatcttaac aacagcagga cgc 163 19 194 DNAArtificial Sequence Description of Artificial Sequencepromoter sequence19 gctattgacg acagctatgg ttcactgtcc accaaccaaa actgtgctca gtaccgccaa 60tatttctccc ttgaggggta caaagaggtg tccctagaag agatccacgc tgtgtaaaaa 120ttttacaaaa aggtattgac tttccctaca gggtgtgtaa taatttaatt acaggcgggg 180gcaaccccgc ctgt 194 20 248 PRT Pseudomonas aeruginosa 20 Met Ile Leu GluLeu Asp Cys Gly Asn Ser Leu Ile Lys Trp Arg Val 1 5 10 15 Ile Glu GlyAla Ala Arg Ser Val Ala Gly Gly Leu Ala Glu Ser Asp 20 25 30 Asp Ala LeuVal Glu Gln Leu Thr Ser Gln Gln Ala Leu Pro Val Arg 35 40 45 Ala Cys ArgLeu Val Ser Val Arg Ser Glu Gln Glu Thr Ser Gln Leu 50 55 60 Val Ala ArgLeu Glu Gln Leu Phe Pro Val Ser Ala Leu Val Ala Ser 65 70 75 80 Ser GlyLys Gln Leu Ala Gly Val Arg Asn Gly Tyr Leu Asp Tyr Gln 85 90 95 Arg LeuGly Leu Asp Arg Trp Leu Ala Leu Val Ala Ala His His Leu 100 105 110 AlaLys Lys Ala Cys Leu Val Ile Asp Leu Gly Thr Ala Val Thr Ser 115 120 125Asp Leu Val Ala 
Ala Asp Gly Val His Leu Gly Gly Tyr Ile Cys Pro 130 135140 Gly Met Thr Leu Met Arg Ser Gln Leu Arg Thr His Thr Arg Arg Ile 145150 155 160 Arg Tyr Asp Asp Ala Glu Ala Arg Arg Ala Leu Ala Ser Leu GlnPro 165 170 175 Gly Gln Ala Thr Ala Glu Ala Val Glu Arg Gly Cys Leu LeuMet Leu 180 185 190 Arg Gly Phe Val Arg Glu Gln Tyr Ala Met Ala Cys GluLeu Leu Gly 195 200 205 Pro Asp Cys Glu Ile Phe Leu Thr Gly Gly Asp AlaGlu Leu Val Arg 210 215 220 Asp Glu Leu Ala Gly Ala Arg Ile Met Pro AspLeu Val Phe Val Gly 225 230 235 240 Leu Ala Leu Ala Cys Pro Ile Glu 24521 209 PRT Campylobacter jejuni 21 Met Leu Leu Cys Asp Ile Gly Asn SerAsn Ala Asn Phe Leu Asp Asp 1 5 10 15 Asn Lys Tyr Phe Thr Leu Asn IleAsp Gln Phe Leu Glu Phe Lys Asn 20 25 30 Glu Gln Lys Ile Phe Tyr Ile AsnVal Asn Glu His Leu Lys Glu His 35 40 45 Leu Lys Asn Gln Lys Asn Phe IleAsn Leu Glu Pro Tyr Phe Leu Phe 50 55 60 Asp Thr Ile Tyr Gln Gly Leu GlyIle Asp Arg Ile Ala Ala Cys Tyr 65 70 75 80 Thr Ile Glu Asp Gly Val ValVal Asp Ala Gly Ser Ala Ile Thr Ile 85 90 95 Asp Ile Ile Ser Asn Ser IleHis Leu Gly Gly Phe Ile Leu Pro Gly 100 105 110 Ile Ala Asn Tyr Lys LysIle Tyr Ser His Ile Ser Pro Arg Leu Lys 115 120 125 Ser Glu Phe Asn ThrGln Val Ser Leu Asp Ala Phe Pro Gln Lys Thr 130 135 140 Met Asp Ala LeuSer Tyr Gly Val Phe Lys Gly Ile Tyr Leu Leu Ile 145 150 155 160 Lys AspAla Ala Gln Asn Lys Lys Leu Tyr Phe Thr Gly Gly Asp Gly 165 170 175 GlnPhe Leu Ala Asn Tyr Phe Asp His Ala Ile Tyr Asp Lys Leu Leu 180 185 190Ile Phe Arg Gly Met Lys Lys Ile Ile Lys Glu Asn Pro Asn Leu Leu 195 200205 Tyr 22 592 PRT Neisseria meningitidis 22 Met Thr Val Leu Lys Pro SerHis Trp Arg Val Leu Ala Glu Leu Ala 1 5 10 15 Asp Gly Leu Pro Gln HisVal Ser Gln Leu Ala Arg Met Ala Asp Met 20 25 30 Lys Pro Gln Gln Leu AsnGly Phe Trp Gln Gln Met Pro Ala His Ile 35 40 45 Arg Gly Leu Leu Arg GlnHis Asp Gly Tyr Trp Arg Leu Val Arg Pro 50 55 60 Leu Ala Val Phe Asp AlaGlu Gly Leu Arg Glu Leu Gly Glu Arg Ser 65 70 75 80 Gly Phe Gln Thr AlaLeu Lys His Glu Cys Ala 
Ser Ser Asn Asp Glu 85 90 95 Ile Leu Glu Leu AlaArg Ile Ala Pro Asp Lys Ala His Lys Thr Ile 100 105 110 Cys Val Thr HisLeu Gln Ser Lys Gly Arg Gly Arg Gln Gly Arg Lys 115 120 125 Trp Ser HisArg Leu Gly Glu Cys Leu Met Phe Ser Phe Gly Trp Val 130 135 140 Phe AspArg Pro Gln Tyr Glu Leu Gly Ser Leu Ser Pro Val Ala Ala 145 150 155 160Val Ala Cys Arg Arg Ala Leu Ser Arg Leu Gly Leu Lys Thr Gln Ile 165 170175 Lys Trp Pro Asn Asp Leu Val Val Gly Arg Asp Lys Leu Gly Gly Ile 180185 190 Leu Ile Glu Thr Val Arg Thr Gly Gly Lys Thr Val Ala Val Val Gly195 200 205 Ile Gly Ile Asn Phe Val Leu Pro Lys Glu Val Glu Asn Ala AlaSer 210 215 220 Val Gln Ser Leu Phe Gln Thr Ala Ser Arg Arg Gly Asn AlaAsp Ala 225 230 235 240 Ala Val Leu Leu Glu Thr Leu Leu Ala Glu Leu AspAla Val Leu Leu 245 250 255 Gln Tyr Ala Arg Asp Gly Phe Ala Pro Phe ValAla Glu Tyr Gln Ala 260 265 270 Ala Asn Arg Asp His Gly Lys Ala Val LeuLeu Leu Arg Asp Gly Glu 275 280 285 Thr Val Phe Glu Gly Thr Val Lys GlyVal Asp Gly Gln Gly Val Leu 290 295 300 His Leu Glu Thr Ala Glu Gly LysGln Thr Val Val Ser Gly Glu Ile 305 310 315 320 Ser Leu Arg Ser Asp AspArg Pro Val Ser Val Pro Lys Arg Arg Asp 325 330 335 Ser Glu Arg Phe LeuLeu Leu Asp Gly Gly Asn Ser Arg Leu Lys Trp 340 345 350 Ala Trp Val GluAsn Gly Thr Phe Ala Thr Val Gly Ser Ala Pro Tyr 355 360 365 Arg Asp LeuSer Pro Leu Gly Ala Glu Trp Ala Glu Lys Val Asp Gly 370 375 380 Asn ValArg Ile Val Gly Cys Ala Val Cys Gly Glu Phe Lys Lys Ala 385 390 395 400Gln Val Gln Glu Gln Leu Ala Arg Lys Ile Glu Trp Leu Pro Ser Ser 405 410415 Ala Gln Ala Leu Gly Ile Arg Asn His Tyr Arg His Pro Glu Glu His 420425 430 Gly Ser Asp Arg Trp Phe Asn Ala Leu Gly Ser Arg Arg Phe Ser Arg435 440 445 Asn Ala Cys Val Val Val Ser Cys Gly Thr Ala Val Thr Val AspAla 450 455 460 Leu Thr Asp Asp Gly His Tyr Leu Gly Gly Thr Ile Met ProGly Phe 465 470 475 480 His Leu Met Lys Glu Ser Leu Ala Val Arg Thr AlaAsn Leu Asn Arg 485 490 495 His Ala Gly Lys Arg Tyr Pro Phe Pro Thr ThrThr Gly Asn Ala Val 500 505 510 
Ala Ser Gly Met Met Asp Ala Val Cys GlySer Val Met Met Met His 515 520 525 Gly Arg Leu Lys Glu Lys Thr Gly AlaGly Lys Pro Val Asp Val Ile 530 535 540 Ile Thr Gly Gly Gly Ala Ala LysVal Ala Glu Ala Leu Pro Pro Ala 545 550 555 560 Phe Leu Ala Glu Asn ThrVal Arg Val Ala Asp Asn Leu Val Ile His 565 570 575 Gly Leu Leu Asn LeuIle Ala Ala Glu Gly Gly Glu Ser Glu His Thr 580 585 590 23 753 DNAClostridium acetobutylicum 23 aataagagag cagcttttat gctgctcttatttttaagga gtgtattaaa agtgatttta 60 gttttagatg ttggcaatac taatatagtgttaggaatat acaatgatac gaaacttaca 120 gctgaatgga gactatcaac agatgtattaagatctgctg acgaatatgg aattcaagta 180 atgaacttat ttcaacaaga taagctcgatccaacattag ttgagggagt aataatatcc 240 tctgttgtac ctaatatcat gtattctttagaacatatga taagaaagta ctttaagata 300 aatccattag ttgttggacc tggaataaaaacaggaatta atattaaata cgataatcct 360 aaagaagttg gagccgacag aattgtaaatgctgtagcag cacatgaaat ttataaaaga 420 tctcttataa taatagattt tggaacagcaactacatttt gtgcagtaag agaaaatgga 480 gattatcttg gtggagcaat atgccctggaattaaagttt catcagaggc tctttttgaa 540 aaggcagcta agcttccaag agtagagctcataaaaccag cgtatgctat ttgtaaaaat 600 actatttcaa gtatacaatc tggaattgtttatcgatacc tacgtcaggt aaaatactta 660 tttgaaaaat tgaaagaaaa cctgccggacggaaggagaa caaggacctc cttggtattg 720 gccacaggtg gtcttgccaa acttattaattga 753 24 798 DNA Streptomyces coelicolor 24 atgctgctga cgatcgacgtagggaacacg cacaccgtcc tcggcctctt cgacggcgag 60 gacatcgtcg agcactggcgcatctccacg gactcgcgcc gcacggccga cgaactggcg 120 gtgctcctcc agggcctcatgggcatgcat cccctcctcg gcgacgaact gggcgacggc 180 atcgacggca tcgccatctgcgcgacggtc ccctccgtcc tccacgaact gcgcgaggtc 240 acccgccgct actacggcgacgtccccgcg gtcctcgtcg aaccgggcgt caagaccggc 300 gtcccgatcc tcaccgaccaccccaaggag gtcggcgccg accgcatcat caacgcggta 360 gcggccgtgg agctctacggcggcccggcg atcgtcgtgg acttcggcac ggcgacgacg 420 ttcgacgcgg tcagcgcgcgcggggagtac atcggcggcg tcatcgcccc cggcatcgag 480 atctcggtcg aggcgctgggcgtcaagggc gcccagctcc gcaagatcga ggtggcgcgc 540 ccccgcagcg tgatcggcaagaacacggtc gaggcgatgc agtccggcat 
cgtgtacggc 600 ttcgccggcc aggtcgacggcgtcgtcaac cgcatggcgc gggagctggc cgacgacccg 660 gacgacgtga cggtcatcgcgacgggcggg ctggcgccga tggtcctggg cgagtcctcg 720 gtcatcgacg agcacgagccgtggctgacg ctgatgggtc tgcgcctggt gtacgagcgc 780 aacgtgtcgc gcatgtag 79825 819 DNA Mycobacterium tuberculosis 25 gtgctgctgg cgattgacgtccgcaacacc cacaccgttg tgggcctgct gtccggaatg 60 aaagagcacg caaaggtcgtgcagcagtgg cggatacgca ccgaatccga agtcaccgcc 120 gacgaactgg cactgacgatcgacgggctg atcggcgagg attccgagcg gctcaccggt 180 accgccgcct tgtccacggtcccgtccgtg ctgcacgagg tgcggataat gctcgaccag 240 tactggccgt cggtgccgcacgtgctgatc gagcccggag tacgcaccgg gatccctttg 300 ctcgtcgaca acccgaaggaagtgggcgca gaccgcatcg tgaactgttt ggccgcctat 360 gaccggttcc ggaaggccgccatcgtcgtt gactttggat cctcgatctg tgttgatgtt 420 gtatcggcca agggtgaatttcttggcggc gccatcgcgc ccggggtgca ggtgtcttcc 480 gatgccgcgg cggcccgctcggcggcattg cgccgcgttg aacttgcccg cccacgttcg 540 gtggttggca agaacaccgtcgaatgcatg caagccggtg cggtgttcgg cttcgccggg 600 ctggtagacg ggttggtaggccgcatccgc gaggacgtgt ccggtttctc cgtcgaccac 660 gatgtcgcga tcgtggctaccgggcatacc gcgcccctgc tgctgccgga attgcacacc 720 gtcgaccatt acgaccagcacctgaccttg cagggtctgc ggctggtgtt cgagcgtaac 780 ctcgaagtcc agcgcggccggctcaagacg gcgcgctga 819 26 777 DNA Rhodobacter capsulatus 26 atgcttttgtgcatcgactg cggcaacacc aacaccgtgt tttcggtctg ggacgggacg 60 gatttcgccgccacctggcg catcgccacc gatcatcgcc gcaccgccga cgaatatttc 120 gtctggctgaacacgctgat gcaactgaag ggcctgcagg gccggatctc cgaggcgatc 180 atctcctcgaccgcgccgcg ggtggtgttc aacctgcgcg ttctgtgcaa ccgctatttc 240 gactgccgcccctatgtcgt cggcaaaccg ggctgcgagc tgccggtggc gccgcgcgtc 300 gatccgggcaccacggtcgg gccggaccgg ctggtcaata cggtggcggg ctatgaccgt 360 catggcggcgatctgatcgt cgtcgatttc ggcaccgcca ccacctttga cgtggtggcc 420 cccgatggcgcctatatcgg cggggtgatc gcgcccgggg tgaacctgag ccttgaggcg 480 ctgcatatggcggcggccgc gctgccgcat gtcgacgtca cgaaaccgca aggggtgatc 540 ggcacgaatacggtggcctg catccaatcc ggggtgtatt ggggctatat cggccttgtc 600 gaaggcatcgtgcggcagat ccggatggaa cgtgaccgtc 
cgatgaaggt gattgccacc 660 gggggtcttgcctcgctctt cgatctgggt ttcgatctgt tcgacaaggt cgaggatgac 720 ctgaccatgcatggtctgcg tctgatcttc gattacaaca agggacttgg ggcgtga 777 27 768 DNAGeobacter sulfurreducens 27 gtgcttcttg ttatagacgt gggtaatacc aatatcgtgctcgggattta cgatggcgag 60 cgcctggtga gggattggcg ggtctccacg gacaaggcccgtactaccga cgagtacggt 120 attctcataa atgagttgtt ccgcttggcg ggccttgggctcgatcagat ccgcgcggtg 180 atcatctcct cggtggtgcc gcccctcacc ggcgtgctggagcgtctttc cctggggtat 240 ttcgggatgc gtcccctggt ggtgggaccg ggcatcaagacaggcatgcc aatccagtac 300 gacaaccccc gggaggtggg ggccgaccgg atcgtgaacgcggtggcggg gtacgagaag 360 taccgcacct ctctcattat cgtcgatttc ggcaccgctaccacgttcga ctacgtgaac 420 cgcaagggag agtactgcgg aggggccatc gcgccgggactcgtcatttc caccgaggcc 480 ctgttccagc gggccagcaa gctgcccagg gttgatatcatacgtccgtc cgcgatcatt 540 gccaggaaca cggtcaattc gatgcaggcg ggaatttactatggttacgt ggggctcgta 600 gacgagatcg tcacccggat gaaggccgag agcaaggatgcgccccgggt tatcgctacc 660 ggagggttgg cgtccctcat agcgccggag tccaagaccatcgaagccgt cgaggaatat 720 ctgacactgg agggattgcg catactgtac gaacgaaacagggagtga 768 28 789 DNA Deinococcus radiodurans 28 gtgcccgctt ttcccctgctcgccgtggac atcggcaaca ccaccaccgt cctgggtctg 60 gccgacgcct cgggcgccctgacccacacc tggcggattc ggaccaaccg cgagatgctg 120 cccgacgacc tcgcgctgcaactgcacggg ctctttaccc tcgccggggc gccgattccc 180 cgcgccgccg tgctgagcagcgtggcgccc ccggtgggcg aaaactacgc gctcgcgctc 240 aagcggcact tcatgatcgacgcttttgcc gtgagtgccg agaacctgcc cgacgtgacg 300 gtggaactcg acacgccgggctcggtgggt gcggaccgcc tgtgcaacct cttcggcgcc 360 gaaaagtacc tgggggggctggactacgcg gtggtagtgg atttcgggac ctccaccaac 420 tttgacgtgg tggggcgggggcggcgtttc ctcggcggca tcctcgccac cggagcgcag 480 gtcagcgccg acgccctgttcgcccgcgcc gccaaactgc cgcgcatcac cctgcaagcg 540 cccgagacgg ccatcggcaaaaacaccgtc cacgcgctgc aatcgggcct ggtcttcggc 600 tacgccgaga tggtggacggcctgctgcgc cgcatccgcg ccgagttgcc gggcgaagcg 660 gtcgccgtcg ccactggcggcttctcgcgc accgtgcagg ggatttgcca ggaaatcgac 720 tactacgacg aaacgctgacgttgcgcggg ttggtggagc 
tgtgggcgag ccgttcggag 780 gtccgctga 789 29 741 DNAThermotoga maritima 29 ttgtacctcc tcgtggacgt gggtaacacg cattctgtcttctctatcac cgaagatggt 60 aaaactttca gaaggtggag gctgtccacc ggtgtgtttcagacggaaga cgaactcttt 120 tcacaccttc atcctcttct gggcgatgct atgcgtgagataaaggggat aggagtggcc 180 tccgtcgttc ccactcagaa cacagtcata gagcgtttttctcaaaagta tttccacata 240 tcaccgatat gggtgaaggc gaaaaacgga tgtgtgaaatggaacgtgaa gaatccctcg 300 gaagtgggtg ctgatagggt ggccaacgtt gtcgctttcgtcaaggaata cggtaaaaac 360 ggaatcatca tcgacatggg aacggcaacc accgtggatcttgttgtgaa cggatcttac 420 gaaggaggag ccattttgcc tggattcttc atgatggttcactcgctctt tcggggaacg 480 gcaaaacttc cgctcgttga ggtaaaacca gcggattttgttgtaggaaa ggatacggag 540 gaaaacatca ggctgggtgt ggtgaacgga agtgtctacgctcttgaggg gataataggg 600 cgaataaagg aagtttacgg tgatttaccg gtggttctcacgggaggtca gtcgaagatc 660 gtgaaagata tgataaaaca cgagattttc gatgaggacctcacgatcaa gggggtgtac 720 catttctgct tcggagattg a 741 30 822 DNATreponema pallidum 30 atgcttttga tagacgtagg gaactcgcac gtagtgttcggaatccaagg cgagaatggt 60 ggccgtgtgt gcgtgcgtga gttgtttcgc cttgcgcctgacgcgcgtaa aacccaagat 120 gagtactcgc ttctcatcca tgcgctttgc gaacgtgcgggggtcggccg tgcttctctc 180 cgtgatgcgt ttatttcctc cgtcgtgcct gtgttgacaaagaccattgc agatgcggtc 240 gctcagatta gcggcgtcca gccggttgtc tttggcccgtgggcgtacga gcacttgccg 300 gtgcgcatac cagagccagt gcgcgcggaa attggcactgacttggtagc caacgcggtg 360 gcggcctatg tgcatttccg ttctgcttgc gtggtagtggattgtggaac agcgctcacc 420 tttacggcgg tggatggcac ggggttgatt caaggggtggcaattgcgcc tggtctgcgc 480 actgcggtgc agtctctcca tacaggaacg gcacaattaccacttgttcc tcttgccctg 540 cctgattccg ttctgggcaa ggatactacg catgcggtgcaggcgggtgt ggtgcggggc 600 acgctctttg ttattcgcgc tatgattgca cagtgtcagaaagagttagg gtgccgctgt 660 gcagcggtga taacgggggg gctttcgcgt cttttctcgtcagaggtgga ctttcctcct 720 atcgatgcac agctgacgct ctcaggtctt gcacatattgcgcggctggt gccgacatct 780 ctcctgccac ctgctacagt gtcaggttca tcggggaatt ga822 31 789 DNA Borrelia burgdorferi 31 atgaataaac ctttattatc agaattgataattgatattg gaaataccag 
cattgctttt 60 gccttattta aagataatca agttaatttatttattaaaa tgaaaacaaa tcttatgtta 120 aggtatgatg aggtttatag cttttttgaagaaaattttg attttaatgt aaataaagtt 180 tttataagca gcgttgttcc tattcttaatgaaacattta aaaatgtcat tttttctttt 240 tttaagataa agcctttgtt tattggttttgatttgaatt atgatttgac atttaatcct 300 tacaaaagcg ataaattttt gctaggttcagacgtttttg ccaatcttgt tgcagccatt 360 gaaaattatt catttgaaaa tgttttagtagtagaccttg gaactgcttg caccattttt 420 gctgttagca ggcaagatgg aatactcggtggtattataa attctggtcc tttgataaat 480 tttaattctt tattagataa tgcctatcttatcaaaaaat tccccattag cactccaaat 540 aatcttttag agagaacgac atctgggagtgtaaacagcg gtttatttta tcaatataag 600 tatttaatag aaggtgttta tcgtgatattaagcagatgt ataaaaaaaa atttaattta 660 ataattactg ggggtaatgc ggacctaattttgtcattaa ttgagataga gtttattttt 720 aatattcatt taactgtaga aggcgttagaattttaggaa attctattga ctttaagttt 780 gttaattga 789 32 690 DNA Aquifexaeolicus 32 atgaggtttt tgacggtaga cgtagggaat tcctccgttg atatcgccctatgggaaggg 60 aagaaagtaa aagattttct gaaactttca cacgaagaat ttttaaaggaagaatttcct 120 aaattaaaag cgctcggaat atccgtaaaa cagagtttta gcgaaaaagtaaggggaaaa 180 ataccgaaga taaagttttt aaagaaggaa aactttccta tacaggttgattacaaaact 240 cctgaaacgc tgggcacgga cagggtagca cttgcttact ccgccaaaaagttttacgga 300 aagaatgttg tagtaatcag tgcgggtact gcccttgtaa ttgacctagttcttgagggc 360 aaatttaagg gagggtttat taccttagga cttggaaaga agttaaaaattctttccgac 420 ctggcggagg gaattcccga gttttttccc gaagaggtag aaatttttcttgggcgttct 480 acacgagagt gcgtcctggg aggggcttac agggagagca cagaatttattaaaagtaca 540 ctgaaactct ggagaaaagt atttaaaaga aagttcaaag tggttataacgggcggagag 600 gggaagtact tttccaagtt cggtatttac gacccactcc ttgttcacaggggcatgaga 660 aatttacttt acctctatca caggatttaa 690 33 774 DNASynechocystis sp. 
33 gtggaaacat caaagccggg ttgtggttta gccctggataatgacaagca aaaaccttgg 60 ttaggcctaa tgataggcaa ctcccgtctg cactgggcatattgtagcgg caatgctccc 120 ctgcaaacct gggttacaga ttacaacccc aagtcagctcagttgccggt tttgttgggg 180 aaagttcctc tgatgttggc atcggtggta ccggaacaaaccgaagtttg gcgagtatat 240 cagcctaaaa ttttgaccct gaagaatctt cccctggtcaatctttaccc cagctttggc 300 attgaccggg ccctggctgg tttagggacg gggctgacctacggctttcc ctgtctagtg 360 gttgatggag gcactgcttt gaccattaca gggtttgaccaagataaaaa actggtgggg 420 ggagcgatct tgcccggttt gggattgcag ttagcaacccttggcgatcg cctggcggcc 480 ctaccgaagt tagaaatgga tcaattaacc gagttgcctgaccgttgggc tttagatacc 540 cccagcgcca tttttagtgg tgttgtctat ggcgtgttgggggcattgca gagttatctc 600 caggattggc aaaagctttt tcctggtgcc gccatggttatcaccggggg agacggcaag 660 atattacatg gcttcctaaa agagcattct cctaatctttcggtggcctg ggatgacaat 720 ttgatcttcc tcggtatggc ggccatacac cacggcgatcgccccatctg ttag 774 34 672 DNA Helicobacter pylori 34 atgccagctaggcaatcttt caaggattta aaagacttga ttttatgcga tataggcaac 60 acacgcatccatttcgcgca aaactaccag ctcttttcaa gcgctaaaga agatttaaag 120 cgtttgggtattcaaaagga aattttttac attagtgtga atgaagaaaa tgaaaaagct 180 cttttaaattgttaccctaa cgctaaaaat atcgcagggt tttttcattt agaaaccgac 240 tatatagggcttgggataga ccggcaaatg gcatgtttag cggtggttaa tggggttata 300 gtggatgctgggagcgcgat tacgattgat ttagtcaaag agggcaagca tttaggaggg 360 tgtattttgcccggtttagc ccaatatgtc catgcgtata aaaaaagcgc gaaaatctta 420 gagcaacctttcaaagcctt agattcttta gaagttttac ccaaaaacac cagagacgct 480 gtgaattacggcatgatttt gagtatcatc tcttgtatcc aacatttagc taaagatcaa 540 aaaatctatctttgtggggg cgatgcgaaa tatttgagcg cgtttttacc tcattctgtt 600 tgcaaggagcgtttggtttt tgacgggatg gaaatcgctc ttaaaaaagc agggatacta 660 gaatgcaaat ga672 35 747 DNA Pseudomonas aeruginosa 35 atgattcttg agctcgactgtggaaactcg ctgatcaagt ggcgggtcat cgagggggcg 60 gcgcggtcgg tcgccggtggccttgcggag tccgatgatg ccctggtcga acagttaacg 120 tcgcagcaag cgctgccagtgcgagcctgt cgcctggtga gcgttcgcag cgagcaggaa 180 acctcgcaac tggtcgcacggttggagcag ctgttcccgg tttcggcgct 
ggttgcatca 240 tccggcaagc agttggcgggtgtgcgcaac ggctatctcg attaccagcg cctggggctc 300 gaccgctggc tggccctcgtcgcggctcat cacctggcta agaaggcctg cctggtcatt 360 gatctgggga ccgcggtcacctctgacctg gtcgcggcgg atggagtgca tctggggggc 420 tacatatgcc cgggcatgaccctgatgaga agccagttgc gcacccatac ccgacgtatc 480 cgctacgacg atgcagaggcccggcgggcg cttgccagtc tccagccagg gcaggccacg 540 gccgaggcgg ttgagcggggttgtctgctc atgctcaggg ggttcgttcg tgagcagtac 600 gccatggcgt gcgagctgctcggtccggat tgtgaaatat tcctgacggg tggggatgcc 660 gaactggttc gcgacgaactggctggcgcc cggatcatgc cggacctggt tttcgtaggg 720 ctggcactgg cttgcccgattgagtga 747 36 630 DNA Campylobacter jejuni 36 atgttgctct gtgatattgggaattcaaat gctaatttcc tagatgataa caaatatttt 60 actcttaata tagatcagtttttagaattt aaaaatgaac aaaaaatttt ttatatcaat 120 gtcaatgaac atctcaaagaacatttaaaa aatcaaaaaa attttatcaa tcttgaacct 180 tattttttat ttgatacaatttatcaagga ttaggaatcg atcgcatagc agcttgttat 240 actattgaag atggagttgttgtagatgca ggtagtgcta ttacaattga tattatttct 300 aattctatac atcttggtggttttatcttg ccaggtattg caaattataa aaaaatttat 360 agccatattt caccacgattaaaaagtgaa tttaacactc aagttagtct tgatgcattc 420 ccacaaaaaa ccatggatgctttaagttat ggtgttttta aaggaattta cctactgata 480 aaagatgccg ctcaaaataaaaagctttat ttcactggtg gagatgggca atttttagca 540 aattatttcg atcacgcaatttatgataaa cttttaatct ttcgaggaat gaaaaagatt 600 ataaaagaaa atcccaatttactttattaa 630 37 1779 DNA Neisseria meningitidis 37 atgacggttttgaagccttc gcactggcgg gtgttggcgg agcttgccga cggtttgccg 60 caacacgtctcgcaactggc gcgtatggcg gatatgaagc cgcagcagct caacggtttt 120 tggcagcagatgccggcgca catacgcggg ctgttgcgcc aacacgacgg ctattggcgg 180 ctggtgcgcccattggcggt tttcgatgcc gaaggtttgc gcgagctggg ggaaaggtcg 240 ggttttcagacggcattgaa gcacgagtgc gcgtccagca acgacgagat actggaattg 300 gcgcggattgcgccggacaa ggcgcacaaa accatatgtg tgacccacct gcaaagtaag 360 ggcagggggcggcaggggcg gaagtggtcg caccgtttgg gcgagtgtct gatgttcagt 420 tttggctgggtgtttgaccg gccgcagtat gagttgggtt cgctgtcgcc tgttgcggca 480 gtggcgtgccggcgcgcctt gtcgcgtttg ggtttgaaaa 
cgcaaatcaa gtggccaaac 540 gatttggtcgtcggacgcga caaattgggc ggcattctga ttgaaacggt caggacgggc 600 ggcaaaacggttgccgtggt cggtatcggc atcaatttcg tgctgcccaa ggaagtggaa 660 aacgccgcttccgtgcaatc gctgtttcag acggcatcgc ggcggggaaa tgccgatgcc 720 gccgtgttgctggaaacgct gttggcggaa cttgatgcgg tgttgttgca atatgcgcgg 780 gacggatttgcgccttttgt ggcggaatat caggctgcca accgcgacca cggcaaggcg 840 gtattgctgttgcgcgacgg cgaaaccgtg ttcgaaggca cggttaaagg cgtggacgga 900 caaggcgttctgcacttgga aacggcagag ggcaaacaga cggtcgtcag cggcgaaatc 960 agcctgcggtccgacgacag gccggtttcc gtgccgaagc ggcgggattc ggaacgtttt 1020 ctgctgttggacggcggcaa cagccggctc aagtgggcgt gggtggaaaa cggcacgttc 1080 gcaaccgtcggtagcgcgcc gtaccgcgat ttgtcgcctt tgggcgcgga gtgggcggaa 1140 aaggtggatggaaatgtccg catcgtcggt tgcgccgtgt gcggagaatt caaaaaggca 1200 caagtgcaggaacagctcgc ccgaaaaatc gagtggctgc cgtcttccgc acaggctttg 1260 ggcatacgcaaccactaccg ccaccccgaa gaacacggtt ccgaccgctg gttcaacgcc 1320 ttgggcagccgccgcttcag ccgcaacgcc tgcgtcgtcg tcagttgcgg cacggcggta 1380 acggttgacgcgctcaccga tgacggacat tatctcgggg gaaccatcat gcccggtttc 1440 cacctgatgaaagaatcgct cgccgtccga accgccaacc tcaaccggca cgccggtaag 1500 cgttatcctttcccgaccac aacgggcaat gccgtcgcca gcggcatgat ggatgcggtt 1560 tgcggctcggttatgatgat gcacgggcgt ttgaaagaaa aaaccggggc gggcaagcct 1620 gtcgatgtcatcattaccgg cggcggcgcg gcaaaagttg ccgaagccct gccgcctgca 1680 tttttggcggaaaataccgt gcgcgtggcg gacaacctcg tcattcacgg gctgctgaac 1740 ctgattgccgccgaaggcgg ggaatcggaa catacttaa 1779 38 804 DNA Bordetella pertussis 38atgattatcc tcatcgactc cggcaacagc cgcctcaaag tcggctggtt tgacccggac 60gcgccgcagg cggcgcgcga gcccgccccc gtcgccttcg acaatctcga cctggacgcg 120ctgggccgct ggctggccac cctgcccagg cgcccgcaac gggcgctggg cgtgaacgtc 180gccgggcttg cccgcggcga agccattgcc gccacgctgc gcgcgggcgg ttgcgacatc 240cggtggctgc gggcccagcc cctggccatg gggctgcgca acggctatcg caatcccgac 300caactgggcg ccgaccgctg ggcgtgcatg gtgggcgtgc tggcgcgcca gccgtccgtg 360cacccgccgc tgctggtggc cagtttcggc acggccacca cgctggacac catcgggccc 420gacaatgtct 
ttcccggcgg gctgatcctg cccggccccg ccatgatgcg cggcgcgctg 480gcctacggca ccgcccacct gcccctggcc gacggcctgg tggccgacta ccccatcgac 540acccatcagg ccatcgccag cggcatcgcc gccgcccagg ccggcgcgat cgtgcggcaa 600tggctggccg gccgccaacg ctacggccag gcgccggaga tctatgtcgc cggcggcggg 660tggcccgaag tgcggcagga agccgagcgc ctgctggcgg tcaccggcgc cgccttcggc 720gccacgccgc agcccactta cctcgacagc cccgtgctcg acggcctggc ggcgctcgcc 780gcgcaaggcg cgccaacggc ctga 804 39 460 PRT Neisseria gonorrhoeae 39 MetGly Glu Cys Leu Met Phe Ser Phe Gly Trp Ala Phe Asp Arg Pro 1 5 10 15Gln Tyr Glu Leu Gly Ser Leu Ser Pro Val Ala Ala Leu Ala Cys Arg 20 25 30Arg Ala Leu Gly Cys Leu Gly Leu Glu Thr Gln Ile Lys Trp Pro Asn 35 40 45Asp Leu Val Val Gly Arg Asp Lys Leu Gly Gly Ile Leu Ile Glu Thr 50 55 60Val Arg Ala Gly Gly Lys Thr Val Ala Val Val Gly Ile Gly Ile Asn 65 70 7580 Phe Val Leu Pro Lys Glu Val Glu Asn Ala Ala Ser Val Gln Ser Leu 85 9095 Phe Gln Thr Ala Ser Arg Arg Gly Asn Ala Asp Ala Ala Val Leu Leu 100105 110 Glu Thr Leu Leu Ala Glu Leu Gly Ala Val Leu Glu Gln Tyr Ala Glu115 120 125 Glu Gly Phe Ala Pro Phe Leu Asn Glu Tyr Glu Thr Ala Asn ArgAsp 130 135 140 His Gly Lys Ala Val Leu Leu Leu Arg Asp Gly Glu Thr ValCys Glu 145 150 155 160 Gly Thr Val Lys Gly Val Asp Gly Arg Gly Val LeuHis Leu Glu Thr 165 170 175 Ala Glu Gly Glu Gln Thr Val Val Ser Gly GluIle Ser Leu Arg Pro 180 185 190 Asp Asn Arg Ser Val Ser Val Pro Lys ArgPro Asp Ser Glu Arg Phe 195 200 205 Leu Leu Leu Glu Gly Gly Asn Ser ArgLeu Lys Trp Ala Trp Val Glu 210 215 220 Asn Gly Thr Phe Ala Thr Val GlySer Ala Pro Tyr Arg Asp Leu Ser 225 230 235 240 Pro Leu Gly Ala Glu TrpAla Glu Lys Ala Asp Gly Asn Val Arg Ile 245 250 255 Val Gly Cys Ala ValCys Gly Glu Ser Lys Lys Ala Gln Val Lys Glu 260 265 270 Gln Leu Ala ArgLys Ile Glu Trp Leu Pro Ser Ser Ala Gln Ala Leu 275 280 285 Gly Ile ArgAsn His Tyr Arg His Pro Glu Glu His Gly Ser Asp Arg 290 295 300 Trp PheAsn Ala Leu Gly Ser Arg Arg Phe Ser Arg Asn Ala Cys Val 305 310 315 320Val Val Ser Cys Gly Thr Ala Val 
Thr Val Asp Ala Leu Thr Asp Asp 325 330335 Gly His Tyr Leu Gly Gly Thr Ile Met Pro Gly Phe His Leu Met Lys 340345 350 Glu Ser Leu Ala Val Arg Thr Ala Asn Leu Asn Arg Pro Ala Gly Lys355 360 365 Arg Tyr Pro Phe Pro Thr Thr Thr Gly Asn Ala Val Ala Ser GlyMet 370 375 380 Met Asp Ala Val Cys Gly Ser Ile Met Met Met His Gly ArgLeu Lys 385 390 395 400 Glu Lys Asn Gly Ala Gly Lys Pro Val Asp Val IleIle Thr Gly Gly 405 410 415 Gly Ala Ala Lys Val Ala Glu Ala Leu Pro ProAla Phe Leu Ala Glu 420 425 430 Asn Thr Val Arg Val Ala Asp Asn Leu ValIle His Gly Leu Leu Asn 435 440 445 Leu Ile Ala Ala Glu Gly Gly Glu SerGlu His Ala 450 455 460 40 1383 DNA Neisseria gonorrhoeae 40 ttgggcgagtgcctgatgtt cagtttcgga tgggcgtttg accgcccgca gtatgagttg 60 ggttcgctgtcgcctgttgc ggcacttgcg tgccggcgcg ctttggggtg tttgggtttg 120 gaaacgcaaatcaagtggcc aaacgatttg gtcgtcggac gcgacaaatt gggcggcatt 180 ctgattgaaacagtcagggc gggcggtaaa acggttgccg tggtcggtat cggcatcaat 240 ttcgtgctgcccaaggaagt ggaaaacgcc gcttccgtgc agtcgctgtt tcagacggca 300 tcgcggcggggcaatgccga tgccgccgta ttgctggaaa cattgcttgc ggaactgggc 360 gcggtgttggaacaatatgc ggaagaaggg ttcgcgccat ttttaaatga gtatgaaacg 420 gccaaccgcgaccacggcaa ggcggtattg ctgttgcgcg acggcgaaac cgtgtgcgaa 480 ggcacggttaaaggcgtgga cggacgaggc gttctgcact tggaaacggc agaaggcgaa 540 cagacggtcgtcagcggcga aatcagcctg cggcccgaca acaggtcggt ttccgtgccg 600 aagcggccggattcggaacg ttttttgctg ttggaaggcg ggaacagccg gctcaagtgg 660 gcgtgggtggaaaacggcac gttcgcaacc gtgggcagcg cgccgtaccg cgatttgtcg 720 cctttgggcgcggagtgggc ggaaaaggcg gatggaaatg tccgcatcgt cggttgcgcc 780 gtgtgcggagaatccaaaaa ggcacaagtg aaggaacagc tcgcccgaaa aatcgagtgg 840 ctgccgtcttccgcacaggc tttgggcata cgcaaccact accgccaccc cgaagaacac 900 ggttccgaccgttggttcaa cgccttgggc agccgccgct tcagccgcaa cgcctgcgtc 960 gtcgtcagttgcggcacggc ggtaacggtt gacgcgctca ccgatgacgg acattatctc 1020 ggcggaaccatcatgcccgg cttccacctg atgaaagaat cgctcgccgt ccgaaccgcc 1080 aacctcaaccgccccgccgg caaacgttac cctttcccga ccacaacggg caacgccgtc 1140 
gcaagcggcatgatggacgc ggtttgcggc tcgataatga tgatgcacgg ccgtttgaaa 1200 gaaaaaaacggcgcgggcaa gcctgtcgat gtcatcatta ccggcggcgg cgcggcgaaa 1260 gtcgccgaagccctgccgcc tgcatttttg gcggaaaata ccgtgcgcgt ggcggacaac 1320 ctcgtcatccacgggctgct gaacctgatt gccgccgaag gcggggaatc ggaacacgct 1380 taa 1383 41244 PRT Porphyromonas gingivalis 41 Met Ser Phe Asn Leu Ile Val Asp GlnGly Asn Ser Ala Cys Lys Val 1 5 10 15 Ala Phe Val Arg Asn Asn Ser IleGlu Ser Ile Ser Phe Leu Pro Gly 20 25 30 Lys Ala Gly Gln Ala Leu Ser HisLeu Val Ala Pro His Arg Phe Asp 35 40 45 Lys Ala Ile Tyr Ser Ser Val GlyLeu Pro Asp Glu Glu Ala Glu Ala 50 55 60 Ile Val Arg Ser Cys Ala Ala AlaSer Leu Met Met Gly Thr Glu Thr 65 70 75 80 Pro Val Pro Leu Arg Leu GlnTyr Asp Arg Arg Thr Leu Gly Ala Asp 85 90 95 Arg Leu Ala Ala Val Val GlyAla His Ser Leu Tyr Pro Asn Thr Glu 100 105 110 Leu Leu Val Ile Asp AlaGly Thr Ala Ile Thr Tyr Glu Arg Val Ser 115 120 125 Ala Glu Gly Ile TyrLeu Gly Gly Asn Ile Ser Pro Gly Leu His Leu 130 135 140 Arg Phe Lys AlaLeu His Leu Phe Thr Gly Arg Leu Pro Leu Ile Asp 145 150 155 160 Pro SerGly Ile Ser Pro Lys Ile Ala Glu Tyr Gly Ser Ser Thr Glu 165 170 175 GluAla Ile Thr Ala Gly Val Ile His Gly Leu Ala Gly Glu Ile Asp 180 185 190Arg Tyr Ile Asp Asp Leu His Ala Lys Glu Gly Arg Ser Ala Val Ile 195 200205 Leu Thr Gly Gly Asp Ala Asn Tyr Leu Ala Arg Ile Ile Arg Ser Gly 210215 220 Ile Leu Ile His Pro Asp Leu Val Leu Leu Gly Leu Asn Arg Ile Leu225 230 235 240 Glu Tyr Asn Val 42 735 DNA Porphyromonas gingivalis 42atgtccttca atctgatcgt cgatcaaggc aattctgcct gtaaggttgc tttcgtccga 60aataatagta tagagagcat ttcctttctg ccgggaaaag ccggacaggc actcagccat 120ctcgtcgctc ctcaccgttt cgacaaggct atctactcat ctgtggggct tcccgacgaa 180gaggctgaag ctattgtgag aagttgtgca gctgcttcct tgatgatggg gactgagacc 240cccgtacccc ttcgcctgca atatgatcgc cgcactttgg gtgccgaccg actggctgcg 300gtggtcggag cgcatagtct ctatccgaat accgaattgc tggtgatcga cgccggtact 360gccatcactt atgaacgagt atccgctgaa gggatctatc tcggtggcaa tatatcgccc 420ggtctccact 
tgcgcttcaa ggctcttcat ctctttacgg gcaggctccc cctgattgat 480ccttccggta tctctccgaa aatagccgag tatggctcct cgaccgaaga agcgatcaca 540gccggagtaa ttcatggcct ggcaggggag atagacagat atattgacga tctgcacgct 600aaagaggggc ggtctgccgt tatactgacc ggaggagatg ccaactattt ggcacggatt 660ataagaagcg gaatactaat tcatcccgat ttagtacttt tgggcctaaa tagaatttta 720gaatataatg tataa 735 43 592 PRT Neisseria meningitidis 43 Met Thr ValLeu Lys Leu Ser His Trp Arg Val Leu Ala Glu Leu Ala 1 5 10 15 Asp GlyLeu Pro Gln His Val Ser Gln Leu Ala Arg Met Ala Asp Met 20 25 30 Lys ProGln Gln Leu Asn Gly Phe Trp Gln Gln Met Pro Ala His Ile 35 40 45 Arg GlyLeu Leu Arg Gln His Asp Gly Tyr Trp Arg Leu Val Arg Pro 50 55 60 Leu AlaVal Phe Asp Ala Glu Gly Leu Arg Glu Leu Gly Glu Arg Ser 65 70 75 80 GlyPhe Gln Thr Ala Leu Lys His Glu Cys Ala Ser Ser Asn Asp Glu 85 90 95 IleLeu Glu Leu Ala Arg Ile Ala Pro Asp Lys Ala His Lys Thr Ile 100 105 110Cys Val Thr His Leu Gln Ser Lys Gly Arg Gly Arg Gln Gly Arg Lys 115 120125 Trp Ser His Arg Leu Gly Glu Cys Leu Met Phe Ser Phe Gly Trp Val 130135 140 Phe Asp Arg Pro Gln Tyr Glu Leu Gly Ser Leu Ser Pro Val Ala Ala145 150 155 160 Val Ala Cys Arg Arg Ala Leu Ser Arg Leu Gly Leu Asp ValGln Ile 165 170 175 Lys Trp Pro Asn Asp Leu Val Val Gly Arg Asp Lys LeuGly Gly Ile 180 185 190 Leu Ile Glu Thr Val Arg Thr Gly Gly Lys Thr ValAla Val Val Gly 195 200 205 Ile Gly Ile Asn Phe Val Leu Pro Lys Glu ValGlu Asn Ala Ala Ser 210 215 220 Val Gln Ser Leu Phe Gln Thr Ala Ser ArgArg Gly Asn Ala Asp Ala 225 230 235 240 Ala Val Leu Leu Glu Thr Leu LeuVal Glu Leu Asp Ala Val Leu Leu 245 250 255 Gln Tyr Ala Arg Asp Gly PheAla Pro Phe Val Ala Glu Tyr Gln Ala 260 265 270 Ala Asn Arg Asp His GlyLys Ala Val Leu Leu Leu Arg Asp Gly Glu 275 280 285 Thr Val Phe Glu GlyThr Val Lys Gly Val Asp Gly Gln Gly Val Leu 290 295 300 His Leu Glu ThrAla Glu Gly Lys Gln Thr Val Val Ser Gly Glu Ile 305 310 315 320 Ser LeuArg Ser Asp Asp Arg Pro Val Ser Val Pro Lys Arg Arg Asp 325 330 335 SerGlu Arg Phe Leu Leu Leu Asp Gly 
Gly Asn Ser Arg Leu Lys Trp 340 345 350Ala Trp Val Glu Asn Gly Thr Phe Ala Thr Val Gly Ser Ala Pro Tyr 355 360365 Arg Asp Leu Ser Pro Leu Gly Ala Glu Trp Ala Glu Lys Ala Asp Gly 370375 380 Asn Val Arg Ile Val Gly Cys Ala Val Cys Gly Glu Phe Lys Lys Ala385 390 395 400 Gln Val Gln Glu Gln Leu Ala Arg Lys Ile Glu Trp Leu ProSer Ser 405 410 415 Ala Gln Ala Leu Gly Ile Arg Asn His Tyr Arg His ProGlu Glu His 420 425 430 Gly Ser Asp Arg Trp Phe Asn Ala Leu Gly Ser ArgArg Phe Ser Arg 435 440 445 Asn Ala Cys Val Val Val Ser Cys Gly Thr AlaVal Thr Val Asp Ala 450 455 460 Leu Thr Asp Asp Gly His Tyr Leu Gly GlyThr Ile Met Pro Gly Phe 465 470 475 480 His Leu Met Lys Glu Ser Leu AlaVal Arg Thr Ala Asn Leu Asn Arg 485 490 495 His Ala Gly Lys Arg Tyr ProPhe Pro Thr Thr Thr Gly Asn Ala Val 500 505 510 Ala Ser Gly Met Met AspAla Val Cys Gly Ser Val Met Met Met His 515 520 525 Gly Arg Leu Lys GluLys Thr Gly Ala Gly Lys Pro Val Asp Val Ile 530 535 540 Ile Thr Gly GlyGly Ala Ala Lys Val Ala Glu Ala Leu Pro Pro Ala 545 550 555 560 Phe LeuAla Glu Asn Thr Val Arg Val Ala Asp Asn Leu Val Ile Tyr 565 570 575 GlyLeu Leu Asn Met Ile Ala Ala Glu Gly Arg Glu Tyr Glu His Ile 580 585 59044 1779 DNA Neisseria meningitidis 44 atgacggttt tgaagctttc gcactggcgggtgttggcgg agcttgccga cggtttgccg 60 caacacgtct cgcaactggc gcgtatggcggatatgaagc cgcagcagct caacggtttt 120 tggcagcaga tgccggcgca catacgcgggctgttgcgcc aacacgacgg ctattggcgg 180 ctggtgcgcc cattggcggt tttcgatgccgaaggtttgc gcgagctggg ggaaaggtcg 240 ggttttcaga cggcattgaa gcacgagtgcgcgtccagca acgacgagat actggaattg 300 gcgcggattg cgccggacaa ggcgcacaaaaccatatgcg tgacccacct gcaaagtaag 360 ggcagggggc ggcaggggcg gaagtggtcgcaccgtttgg gcgagtgtct gatgttcagt 420 tttggctggg tgtttgaccg gccgcagtatgagttgggtt cgctgtcgcc tgttgcggca 480 gtggcgtgtc ggcgcgcctt gtcgcgtttaggtttggatg tgcagattaa gtggcccaat 540 gatttggttg tcggacgcga caaattgggcggcattctga ttgaaacggt caggacgggc 600 ggcaaaacgg ttgccgtggt cggtatcggcatcaattttg tcctgcccaa ggaagtagaa 660 aatgccgctt ccgtgcaatc 
gctgtttcagacggcatcgc ggcggggcaa tgccgatgcc 720 gccgtgctgc tggaaacgct gttggtggaactggacgcgg tgttgttgca atatgcgcgg 780 gacggatttg cgccttttgt ggcggaatatcaggctgcca accgcgacca cggcaaggcg 840 gtattgctgt tgcgcgacgg cgaaaccgtgttcgaaggca cggttaaagg cgtggacgga 900 caaggcgttt tgcacttgga aacggcagagggcaaacaga cggtcgtcag cggcgaaatc 960 agcctgcggt ccgacgacag gccggtttccgtgccgaagc ggcgggattc ggaacgtttt 1020 ctgctgttgg acggcggcaa cagccggctcaagtgggcgt gggtggaaaa cggcacgttc 1080 gcaaccgtcg gtagcgcgcc gtaccgcgatttgtcgcctt tgggcgcgga gtgggcggaa 1140 aaggcggatg gaaatgtccg catcgtcggttgcgctgtgt gcggagaatt caaaaaggca 1200 caagtgcagg aacagctcgc ccgaaaaatcgagtggctgc cgtcttccgc acaggctttg 1260 ggcatacgca accactaccg ccaccccgaagaacacggtt ccgaccgctg gttcaacgcc 1320 ttgggcagcc gccgcttcag ccgcaacgcctgcgtcgtcg tcagttgcgg cacggcggta 1380 acggttgacg cgctcaccga tgacggacattatctcgggg gaaccatcat gcccggtttc 1440 cacctgatga aagaatcgct cgccgtccgaaccgccaacc tcaaccggca cgccggtaag 1500 cgttatcctt tcccgaccac aacgggcaatgccgtcgcca gcggcatgat ggatgcggtt 1560 tgcggctcgg ttatgatgat gcacgggcgtttgaaagaaa aaaccggggc gggcaagcct 1620 gtcgatgtca tcattaccgg cggcggcgcggcaaaagttg ccgaagccct gccgcctgca 1680 tttttggcgg aaaataccgt gcgcgtggcggacaacctcg tcatttacgg gttgttgaac 1740 atgattgccg ccgaaggcag ggaatatgaacatatttaa 1779 45 262 PRT Bacillus anthracis 45 Met Ile Phe Val Leu AspVal Gly Asn Thr Asn Ala Val Leu Gly Val 1 5 10 15 Phe Glu Glu Gly GluLeu Arg Gln His Trp Arg Met Glu Thr Asp Arg 20 25 30 His Lys Thr Glu AspGlu Tyr Gly Met Leu Val Lys Gln Leu Leu Glu 35 40 45 His Glu Gly Leu SerPhe Glu Asp Val Lys Gly Ile Ile Val Ser Ser 50 55 60 Val Val Pro Pro IleMet Phe Ala Leu Glu Arg Met Cys Glu Lys Tyr 65 70 75 80 Phe Lys Ile LysPro Leu Val Val Gly Pro Gly Ile Lys Thr Gly Leu 85 90 95 Asn Ile Lys TyrGlu Asn Pro Arg Glu Val Gly Ala Asp Arg Ile Val 100 105 110 Asn Ala ValAla Gly Ile His Leu Tyr Gly Ser Pro Leu Ile Ile Val 115 120 125 Asp PheGly Thr Ala Thr Thr Tyr Cys Tyr Ile Asn Glu Glu Lys His 130 135 140 TyrMet Gly Gly Val Ile 
Thr Pro Gly Ile Met Ile Ser Ala Glu Ala 145 150 155160 Leu Tyr Ser Arg Ala Ala Lys Leu Pro Arg Ile Glu Ile Thr Lys Pro 165170 175 Ser Ser Val Val Gly Lys Asn Thr Val Ser Ala Met Gln Ser Gly Ile180 185 190 Leu Tyr Gly Tyr Val Gly Gln Val Glu Gly Ile Val Lys Arg MetLys 195 200 205 Glu Glu Ala Lys Gln Glu Pro Lys Val Ile Ala Thr Gly GlyLeu Ala 210 215 220 Lys Leu Ile Ser Glu Glu Ser Asn Val Ile Asp Val ValAsp Pro Phe 225 230 235 240 Leu Thr Leu Lys Gly Leu Tyr Met Leu Tyr GluArg Asn Ala Asn Leu 245 250 255 Gln His Glu Lys Gly Glu 260 46 789 DNABacillus anthracis 46 atgatttttg tattggatgt agggaacaca aatgctgtactgggcgtgtt tgaagagggg 60 gaacttcgtc aacattggcg catggaaaca gatcgtcataagacagaaga tgaatatgga 120 atgcttgtga agcagttgct tgagcatgag ggtctttcgtttgaagatgt gaaaggtatt 180 atcgtatctt cagtcgtgcc accaattatg tttgctttagagcgcatgtg tgaaaagtat 240 tttaaaatta agccgcttgt agtaggtcct ggaataaaaacggggctaaa tattaaatat 300 gaaaatccac gtgaagtagg tgcggatcga atcgtaaatgcagtagcagg gatccactta 360 tatggaagtc cgcttattat tgtcgatttt ggtacggctactacatattg ttatattaac 420 gaagaaaagc attatatggg tggagttatt acaccgggaattatgatttc agcagaggct 480 ttatatagta gagccgcaaa acttcctcgt attgaaattacaaaaccaag cagtgtagtt 540 gggaagaata cggtaagtgc gatgcaatct ggtattctttatggttatgt tggacaagtg 600 gaaggtattg ttaagcgcat gaaagaggaa gctaaacaagaaccgaaagt tattgcaaca 660 ggtggattgg cgaaattaat ttcagaagaa tcgaatgtgattgatgttgt agatccattt 720 ttaacattaa aaggtttgta tatgttatac gagcggaatgcaaatttaca gcatgagaaa 780 ggtgaataa 789 47 254 PRT Bacillus halodurans47 Met Ile Leu Val Ile Asp Val Gly Asn Thr Asn Thr Val Leu Gly Val 1 510 15 Tyr Gln Asp Glu Thr Leu Val His His Trp Arg Leu Ala Thr Ser Arg 2025 30 Gln Lys Thr Glu Asp Glu Tyr Ala Met Thr Val Arg Ser Leu Phe Asp 3540 45 His Ala Gly Leu Gln Phe Gln Asp Ile Asp Gly Ile Val Ile Ser Ser 5055 60 Val Val Pro Pro Met Met Phe Ser Leu Glu Gln Met Cys Lys Lys Tyr 6570 75 80 Phe His Val Thr Pro Met Ile Ile Gly Pro Gly Ile Lys Thr Gly Leu85 90 95 Asn Ile Lys Tyr Asp Asn Pro Lys Glu Val Gly Ala Asp 
Arg Ile Val100 105 110 Asn Ala Val Ala Ala Ile Glu Leu Tyr Gly Tyr Pro Ala Ile ValVal 115 120 125 Asp Phe Gly Thr Ala Thr Thr Tyr Cys Leu Ile Asn Glu LysLys Gln 130 135 140 Tyr Ala Gly Gly Val Ile Ala Pro Gly Ile Met Ile SerThr Glu Ala 145 150 155 160 Leu Tyr His Arg Ala Ser Lys Leu Pro Arg IleGlu Ile Ala Lys Pro 165 170 175 Lys Gln Val Val Gly Thr Asn Thr Ile AspSer Met Gln Ser Gly Ile 180 185 190 Phe Tyr Gly Tyr Val Ser Gln Val AspGly Val Val Lys Arg Met Lys 195 200 205 Ala Gln Ala Glu Ser Glu Pro LysVal Ile Ala Thr Gly Gly Leu Ala 210 215 220 Lys Leu Ile Gly Thr Glu SerGlu Thr Ile Asp Val Ile Asp Ser Phe 225 230 235 240 Leu Thr Leu Lys GlyLeu Gln Leu Ile Tyr Lys Lys Asn Val 245 250 48 765 DNA Bacillushalodurans 48 atgatacttg tcattgatgt tggaaataca aatactgtgt taggggtctaccaagatgaa 60 acgttagtgc atcattggcg gctagcgacg agtaggcaaa agaccgaggatgagtatgca 120 atgacggtgc gttctctctt tgatcatgca ggtctacagt ttcaagacatagacggcatt 180 gtcatttcat ctgttgtccc accgatgatg ttttccttag agcaaatgtgcaaaaaatac 240 tttcatgtca ctcctatgat tattgggcct ggaattaaga caggcttaaatattaagtat 300 gacaatccaa aagaggttgg ggccgatcga atcgttaatg cagttgcagcgattgagtta 360 tatggctacc ctgccattgt cgttgatttt ggaacagcaa caacatattgcttaattaat 420 gaaaaaaaac aatatgcagg gggagtcatt gctcctggaa tcatgatctcaacagaagcg 480 ttgtatcatc gcgcatcaaa attgccacgg attgaaatag cgaagccgaaacaagtcgta 540 gggacaaata cgattgattc gatgcaatca ggaatcttct acgggtatgtgagccaagtc 600 gatggtgttg tgaaacgaat gaaggctcaa gcagaaagtg aaccgaaagtcattgcaact 660 ggtgggcttg cgaagttaat cggaaccgag tcggaaacca ttgatgtaatcgattcgttt 720 ttaacattaa aaggattgca actcatttat aagaagaatg tctga 765 49258 PRT Bacillus stearothermophilus 49 Met Ile Phe Val Leu Asp Val GlyAsn Thr Asn Thr Val Leu Gly Val 1 5 10 15 Tyr Asp Gly Asp Glu Leu LysHis His Trp Arg Ile Glu Thr Ser Arg 20 25 30 Ser Lys Thr Glu Asp Glu TyrGly Met Met Ile Lys Ala Leu Leu Asn 35 40 45 His Val Gly Leu Gln Phe SerAsp Ile Arg Gly Ile Ile Ile Ser Ser 50 55 60 Val Val Pro Pro Ile Met PheAla Leu Glu Arg Met Cys Leu Lys 
Tyr 65 70 75 80 Phe His Ile Lys Pro LeuIle Val Gly Pro Gly Ile Lys Thr Gly Leu 85 90 95 Asp Ile Lys Tyr Asp AsnPro Arg Glu Val Gly Ala Asp Arg Ile Val 100 105 110 Asn Ala Val Ala GlyIle His Leu Tyr Gly Ser Pro Leu Ile Ile Val 115 120 125 Asp Phe Gly ThrAla Thr Thr Tyr Cys Tyr Ile Asn Glu His Lys Gln 130 135 140 Tyr Met GlyGly Ala Ile Ala Pro Gly Ile Met Ile Ser Thr Glu Ala 145 150 155 160 LeuPhe Ala Arg Ala Ala Lys Leu Pro Arg Ile Glu Ile Ala Arg Pro 165 170 175Asp Asp Ile Ile Gly Lys Asn Thr Val Ser Ala Met Gln Ala Gly Ile 180 185190 Leu Tyr Gly Tyr Val Gly Gln Val Glu Gly Ile Val Ser Arg Met Lys 195200 205 Ala Lys Ser Lys Ile Pro Pro Lys Val Ile Ala Thr Gly Gly Leu Ala210 215 220 Pro Leu Ile Ala Ser Glu Ser Asp Ile Ile Asp Val Val Asp ProPhe 225 230 235 240 Leu Thr Leu Thr Gly Leu Lys Leu Leu Tyr Glu Lys AsnThr Glu Lys 245 250 255 Lys Gly 50 777 DNA Bacillus stearothermophilus50 atgatttttg tattggacgt cggcaataca aacacggtgt taggggtgta tgacggggac 60gaactgaaac atcattggcg cattgaaaca agccgctcga aaacggaaga cgaatacggc 120atgatgatca aagcgctctt gaaccatgtc ggcttgcagt tttccgacat tcgaggcatc 180atcatttcct cggtcgtgcc gccgattatg tttgctcttg aacgcatgtg tctaaaatat 240ttccatatca aaccgctcat cgtcggtccg ggcattaaaa ccgggctcga catcaaatat 300gacaatccgc gtgaggtggg cgccgaccgg attgtcaacg cggttgccgg catccatttg 360tacggcagtc cgctgattat cgtcgatttt ggcacggcga cgacgtattg ttatattaat 420gaacataaac aatatatggg aggggccatt gccccgggaa ttatgatctc gacagaggct 480ctgtttgcgc gggcggcgaa attgccgcgc attgaaatcg cccgcccgga tgatatcatc 540ggcaaaaata cggtcagcgc catgcaagcc ggtattttat acggttatgt cggacaagtg 600gaaggcatcg tgtcgcgaat gaaggcgaaa agcaaaatcc cgccgaaggt gattgctact 660ggcggtttgg ctccgctcat tgccagcgaa tcggacatca tcgatgtcgt tgatccgttt 720ttgacgctga ctggcttaaa attgttgtac gagaaaaaca ccgagaaaaa aggatga 777 51260 PRT Caulobacter crescentus 51 Met Leu Leu Ala Ile Glu Gln Gly AsnThr Asn Thr Met Phe Ala Ile 1 5 10 15 His Asp Gly Ala Ser Trp Val AlaGln Trp Arg Ser Ala Thr Glu Ser 20 25 30 Thr Arg Thr Ala Asp Glu Tyr 
ValVal Trp Leu Ser Gln Leu Leu Ser 35 40 45 Met Gln Gly Leu Gly Phe Arg AlaIle Asp Ala Val Ile Ile Ser Ser 50 55 60 Val Val Pro Gln Ser Ile Phe AsnLeu Arg Asn Leu Ser Arg Arg Tyr 65 70 75 80 Phe Asn Val Glu Pro Leu ValIle Gly Glu Asn Ala Lys Leu Gly Ile 85 90 95 Asp Val Arg Ile Glu Lys ProSer Glu Ala Gly Ala Asp Arg Leu Val 100 105 110 Asn Ala Ile Gly Ala AlaMet Val Tyr Pro Gly Pro Leu Val Val Ile 115 120 125 Asp Ser Gly Thr AlaThr Thr Phe Asp Ile Val Ala Ala Asp Gly Ala 130 135 140 Phe Glu Gly GlyIle Ile Ala Pro Gly Ile Asn Leu Ser Met Gln Ala 145 150 155 160 Leu HisGlu Ala Ala Ala Lys Leu Pro Arg Ile Ala Ile Gln Arg Pro 165 170 175 AlaGly Asn Arg Ile Val Gly Thr Asp Thr Val Ser Ala Met Gln Ser 180 185 190Gly Val Phe Trp Gly Tyr Ile Ser Leu Ile Glu Gly Leu Val Ala Arg 195 200205 Ile Lys Ala Glu Arg Gly Glu Pro Met Thr Val Ile Ala Thr Gly Gly 210215 220 Val Ala Ser Leu Phe Glu Gly Ala Thr Asp Ser Ile Asp His Phe Asp225 230 235 240 Ser Asp Leu Thr Ile Arg Gly Leu Leu Glu Ile Tyr Arg ArgAsn Thr 245 250 255 Ile Ala Glu Ser 260 52 783 DNA Caulobactercrescentus 52 atgctgctgg ccattgagca gggcaacacc aacaccatgt tcgccattcatgatggcgca 60 tcgtgggtcg cgcagtggcg gtcagcgacc gaaagcacgc gcacggccgatgagtacgtc 120 gtctggcttt cgcaactgct gtcgatgcag gggcttggct tccgggcgatcgacgccgtg 180 atcatttcca gcgtcgtgcc gcagtcgatc ttcaatctgc gcaacctgagccgccgctac 240 ttcaacgtcg agcctctggt catcggtgag aacgccaagc tgggcattgatgtccgcatc 300 gagaaaccct ccgaggccgg cgccgaccgc ctggtcaacg ccattggcgcggcgatggtc 360 tatccgggtc cgctggtcgt gatcgacagc ggcaccgcga cgacgttcgacatcgtggcc 420 gccgacggcg ccttcgaggg cgggattatc gcgcccggta tcaacctgtcgatgcaggct 480 ctgcacgagg cggcggcgaa gctgccgcgc atcgccatcc agcgtcccgccggtaacagg 540 atcgtgggca cggacacggt ctccgccatg cagtccggcg tcttctggggctatatttcg 600 ctgatcgaag gcctcgtcgc gcggatcaag gccgagcgcg gcgagcctatgaccgttatc 660 gccacgggtg gcgtcgcctc gctgttcgag ggcgcgaccg acagcattgaccacttcgac 720 tctgatctga cgatccgggg tcttctcgaa atctaccgcc gaaacaccatcgccgagtcc 780 tga 783 53 257 PRT 
Chlorobium tepidum 53 Met Arg Leu ValVal Asp Ile Gly Asn Thr Ser Thr Thr Leu Ala Ile 1 5 10 15 Phe Thr GlyAsp Glu Glu Pro Ser Val Glu Ser Val Pro Ser Ala Leu 20 25 30 Phe Ala AspSer Ser Thr Met Arg Glu Val Phe Gly Asn Met Ala Arg 35 40 45 Lys His GlyGlu Pro Gln Ala Ile Ala Ile Cys Ser Val Val Pro Ser 50 55 60 Ala Thr AlaVal Gly Ser Ala Leu Leu Glu Ser Leu Phe Ser Val Pro 65 70 75 80 Val LeuThr Ile Cys Cys Lys Leu Arg Phe Pro Phe Arg Leu Asp Tyr 85 90 95 Ala ThrPro His Thr Phe Gly Ala Asp Arg Leu Ala Leu Cys Ala Trp 100 105 110 SerArg His Leu Phe Ser Glu Lys Pro Val Ile Ala Val Asp Ile Gly 115 120 125Thr Ala Ile Thr Phe Asp Val Leu Asp Thr Val Gly Asn Tyr Arg Gly 130 135140 Gly Leu Ile Met Pro Gly Ile Asp Met Met Ala Gly Ala Leu His Ser 145150 155 160 Arg Thr Ala Gln Leu Pro Gln Val Arg Ile Asp Arg Pro Glu SerLeu 165 170 175 Leu Gly Arg Ser Thr Thr Glu Cys Ile Lys Ser Gly Val PheTrp Gly 180 185 190 Val Val Lys Gln Ile Gly Gly Leu Val Asp Ala Ile ArgGly Asp Leu 195 200 205 Val Arg Asp Phe Gly Glu Ser Thr Val Glu Val IleVal Thr Gly Gly 210 215 220 Asn Ser Arg Ile Ile Val Pro Glu Ile Gly ProVal Ser Val Ile Asp 225 230 235 240 Glu Leu Ala Val Leu Arg Gly Ser AspLeu Leu Leu Arg Met Asn Met 245 250 255 Pro 54 774 DNA Chlorobiumtepidum 54 gtgcggctgg tcgttgacat cggcaatacc agcacgacgt tggcgattttcaccggtgat 60 gaagagccgt cggtcgagtc ggtaccgagt gcgttgtttg ccgattccagcacaatgcgc 120 gaagtgtttg gcaacatggc ccggaagcac ggcgagccac aggccatcgccatttgcagc 180 gtggtgcctt ccgctaccgc cgtcggttcg gcgcttctcg aatcacttttctccgtaccg 240 gtgctgacca tctgctgtaa gctccgtttt ccttttcgtc tcgactacgcaaccccgcac 300 accttcggcg cggatcgcct tgccctgtgc gcatggagcc gacatctcttttctgaaaaa 360 ccggttatcg ccgtcgatat cggcacggcc atcaccttcg acgtgctcgacacggtgggg 420 aattatcgcg gtggtctcat catgccgggt atcgacatga tggccggagcgcttcattcg 480 agaaccgccc agcttcccca ggtgcgcatc gacaggccgg agagccttctcgggcgctcg 540 acgaccgaat gcatcaaaag cggagttttc tggggagtgg tcaaacagatcggcggcctc 600 gtggacgcca ttcgcggcga ccttgtacgc gactttggcg 
agtcaacggtcgaagtgatt 660 gtcaccggcg gcaatagcag gattatcgtt ccggagatcg gccctgtcagtgttatcgac 720 gaactcgctg tcctgcgcgg cagcgatctt ttgctgcgga tgaatatgccgtga 774 55 256 PRT Clostridium difficile 55 Met Leu Leu Val Phe Asp ValGly Asn Thr Asn Met Val Leu Gly Ile 1 5 10 15 Tyr Lys Gly Asp Lys LeuVal Asn Tyr Trp Arg Ile Lys Thr Asp Arg 20 25 30 Glu Lys Thr Ser Asp GluTyr Gly Ile Leu Ile Ser Asn Leu Phe Asp 35 40 45 Tyr Asp Asn Val Asn IleSer Asp Ile Asp Asp Val Ile Ile Ser Ser 50 55 60 Val Val Pro Asn Val MetHis Ser Leu Glu Asn Phe Cys Ile Lys Tyr 65 70 75 80 Cys Lys Lys Gln ProLeu Ile Val Gly Pro Gly Ile Lys Thr Gly Leu 85 90 95 Asn Ile Lys Tyr AspAsn Pro Lys Gln Val Gly Ala Asp Arg Ile Val 100 105 110 Asn Ala Val AlaGly Ile Glu Lys Tyr Gly Ala Pro Ser Ile Leu Val 115 120 125 Asp Phe GlyThr Ala Thr Thr Phe Cys Ala Ile Ser Glu Lys Gly Glu 130 135 140 Tyr LeuGly Gly Thr Ile Ala Pro Gly Ile Lys Ile Ser Ser Glu Ala 145 150 155 160Leu Phe Gln Ser Ala Ser Lys Leu Pro Arg Val Glu Leu Ala Lys Pro 165 170175 Gly Met Thr Ile Cys Lys Ser Thr Val Ser Ala Met Gln Ser Gly Ile 180185 190 Ile Tyr Gly Tyr Val Gly Leu Val Asp Lys Ile Ile Ser Ile Met Lys195 200 205 Lys Glu Leu Asn Cys Asp Asp Val Lys Val Ile Ala Thr Gly GlyLeu 210 215 220 Ala Lys Leu Ile Ala Ser Glu Thr Lys Ser Ile Asp Tyr ValAsp Gly 225 230 235 240 Phe Leu Thr Leu Glu Gly Leu Arg Ile Ile Tyr GluLys Asn Gln Glu 245 250 255 56 771 DNA Clostridium difficile 56atgcttctag tatttgatgt tggaaatact aatatggttt taggtatata taaaggtgac 60aaattagtta attactggag aattaaaaca gatagggaaa aaacgtctga tgaatatgga 120atcctgataa gtaacctatt tgattatgat aatgtgaata taagtgatat tgatgatgtt 180ataatatcat ctgtagttcc gaatgttatg cattctcttg aaaacttttg tataaagtac 240tgtaaaaaac agccattaat agtaggtcca ggcataaaaa caggtctaaa tataaaatat 300gataatccaa aacaagttgg ggcagataga atagttaatg ctgtagcagg gatagaaaag 360tatggagcac caagtatact tgttgatttt ggaacagcaa ctacattttg tgctatctct 420gaaaaaggtg aatatttggg tggaacaata gcaccaggaa taaaaatatc tagtgaggcg 480ttatttcaaa gtgcgtctaa 
attacctaga gtagaattag ctaagccagg tatgactatt 540tgtaagagta ctgtatcagc catgcaatct ggaataattt atggatatgt tggtttagtt 600gacaaaataa taagtattat gaagaaagaa ttgaattgtg atgatgttaa ggttatagct 660acaggtggat tagctaaact gattgcttca gagacgaaaa gtatagatta tgtagatggt 720tttttaacac tagaaggatt gagaataata tatgaaaaaa accaagaata a 771 57 219 PRTDehalococcoides ethenogenes 57 Met Ser Glu Lys Leu Val Ala Val Asp IleGly Asn Thr Ser Val Asn 1 5 10 15 Ile Gly Ile Phe Glu Gly Glu Lys LeuLeu Ala Asn Trp His Leu Gly 20 25 30 Ser Val Ala Gln Arg Met Ala Asp GluTyr Ala Ser Leu Leu Leu Gly 35 40 45 Leu Leu Gln His Ala Gly Ile His ProGlu Glu Leu Asn Arg Val Ile 50 55 60 Met Cys Ser Val Val Pro Pro Leu ThrThr Thr Phe Glu Glu Val Phe 65 70 75 80 Lys Ser Tyr Phe Lys Ala Ala ProLeu Val Val Gly Ala Gly Ile Lys 85 90 95 Ser Gly Val Lys Val Arg Met AspAsn Pro Arg Glu Val Gly Ala Asp 100 105 110 Arg Ile Val Asn Ala Ala AlaAla Arg Val Leu Tyr Pro Gly Ala Cys 115 120 125 Ile Ile Val Asp Met GlyThr Ala Thr Thr Phe Asp Thr Leu Ser Glu 130 135 140 Gly Gly Ala Tyr IleGly Gly Ala Ile Ala Pro Gly Ile Ala Thr Ser 145 150 155 160 Ala Gln AlaIle Ala Glu Lys Thr Ser Lys Leu Pro Lys Ile Glu Ile 165 170 175 Ile ArgPro Ala Lys Val Ile Gly Ser Asn Thr Val Ser Ala Met Gln 180 185 190 SerGly Ile Tyr Phe Gly Tyr Ile Gly Leu Val Glu Glu Leu Val Arg 195 200 205Arg Ile Gln Thr Glu Leu Gly Gln Lys Thr Arg 210 215 58 659 DNADehalococcoides ethenogenes 58 atgtctgaaa aactggtggc ggtagatatcggcaatacca gcgtaaatat aggtatattt 60 gagggcgaaa aactgctggc aaactggcatctgggttcgg ttgcccagcg tatggctgat 120 gaatatgcca gtctgctctt aggcctgttgcagcacgccg gtatacaccc ggaagagcta 180 aacagggtaa tcatgtgcag tgttgtgccgcccctgacca ctacttttga agaggtattt 240 aaaagctatt tcaaggctgc tcctctggtagtgggtgcag gtataaagag cggggttaag 300 gtgcgcatgg ataacccccg tgaggttggggctgaccgca tagtaaatgc cgctgccgcc 360 agggtgcttt atccgggggc gtgcataatagtggacatgg gtacggccac tacctttgat 420 accctttccg agggtggggc atatataggcggggcgattg cacccggtat tgccacctca 480 gcccaggcta ttgcggaaaa 
gacttcaaaactgcccaaga ttgagataat ccgtcctgcc 540 aaagttatcg gctctaatac tgtgtcggctatgcagtcag gtatatactt cggttatatc 600 gggctggtgg aagagctggt caggcggattcaaactgaat tggggcagaa aaccagagt 659 59 212 PRT Desulfovibrio vulgaris 59Met Thr Gln His Phe Leu Leu Phe Asp Ile Gly Asn Thr Asn Val Lys 1 5 1015 Ile Gly Ile Ala Val Glu Thr Ala Val Leu Thr Ser Tyr Val Leu Pro 20 2530 Thr Asp Pro Gly Gln Thr Thr Asp Ser Ile Gly Leu Arg Leu Leu Glu 35 4045 Val Leu Arg His Ala Gly Leu Gly Pro Ala Asp Val Gly Ala Cys Val 50 5560 Ala Ser Ser Val Val Pro Gly Val Asn Pro Leu Ile Arg Arg Ala Cys 65 7075 80 Glu Arg Tyr Leu Tyr Arg Lys Leu Leu Phe Ala Pro Gly Asp Ile Ala 8590 95 Ile Pro Leu Asp Asn Arg Tyr Glu Arg Pro Ala Glu Val Gly Ala Asp100 105 110 Arg Leu Val Ala Ala Tyr Ala Ala Arg Arg Leu Tyr Pro Gly ProArg 115 120 125 Ser Leu Val Ser Val Asp Phe Gly Thr Ala Thr Thr Phe AspCys Val 130 135 140 Glu Gly Gly Ala Tyr Leu Gly Gly Leu Ile Cys Pro GlyVal Leu Ser 145 150 155 160 Ser Ala Gly Ala Leu Ser Ser Arg Thr Ala LysLeu Pro Arg Ile Ser 165 170 175 Leu Glu Val Glu Glu Asp Ser Pro Val IleGly Arg Ser Thr Thr Thr 180 185 190 Ser Leu Asn His Gly Phe Ile Phe GlyPhe Ala Ala Met Thr Glu Gly 195 200 205 Val Leu Ala Ala 210 60 639 DNADesulfovibrio vulgaris 60 atgacccagc atttcctgct gttcgacatc ggcaacaccaacgtcaagat cggcatcgcg 60 gtggaaaccg ccgtgctgac ttcgtacgtg ctgcccacagaccccggcca gacgaccgac 120 tccatcgggc tgcgcctgct ggaggtgctg cgccatgccgggctgggacc ggcggacgtg 180 ggggcctgcg tggccagttc ggtggtgccc ggcgtcaacccgctgatccg ccgcgcctgc 240 gaacgttacc tgtatcgcaa gctgctgttc gcccccggcgacatcgccat tccgctggac 300 aaccgctacg aacggcccgc cgaagtgggc gcggaccggctggtggcggc ctatgccgcc 360 cggcggctgt accccggccc ccggtcgctg gtatccgtggatttcggcac cgccaccacg 420 tttgactgcg tggaaggggg tgcgtatctt ggtggtttgatctgtcccgg cgtgctgtcg 480 tccgccgggg cgttgtcgtc gcgcacggcc aagctgccgcgcatcagtct ggaagtggaa 540 gaggattcgc cggtcatcgg gcggtccacc accaccagcctgaaccacgg cttcattttc 600 ggctttgccg ccatgaccga aggggtgctg gccgcctga 63961 249 PRT Pseudomonas 
putida 61 Met Ile Leu Glu Leu Asp Cys Gly Asn SerPhe Ile Lys Trp Arg Val 1 5 10 15 Ile His Val Ala Asp Ala Val Ile GluGly Gly Gly Ile Val Asp Ser 20 25 30 Asp Gln Ala Leu Val Ala Glu Val AlaAla Leu Ala Ser Val Arg Leu 35 40 45 Thr Gly Cys Arg Ile Val Ser Val ArgSer Glu Glu Glu Thr Asp Ala 50 55 60 Leu Cys Ala Leu Ile Ala Gln Ala PheAla Val Gln Ala Lys Val Ala 65 70 75 80 His Pro Val Arg Glu Met Ala GlyVal Arg Asn Gly Tyr Asp Asp Tyr 85 90 95 Gln Arg Leu Gly Met Asp Arg TrpLeu Ala Ala Leu Gly Ala Phe His 100 105 110 Leu Ala Lys Gly Ala Cys LeuVal Ile Asp Leu Gly Thr Ala Ala Lys 115 120 125 Ala Asp Phe Val Ser AlaAsp Gly Glu His Leu Gly Gly Tyr Ile Cys 130 135 140 Pro Gly Met Pro LeuMet Arg Ser Gln Leu Arg Thr His Thr Arg Arg 145 150 155 160 Ile Arg TyrAsp Asp Ala Ser Ala Glu Arg Ala Leu Ser Ser Leu Ser 165 170 175 Pro GlyArg Ser Thr Val Glu Ala Val Glu Arg Gly Cys Val Leu Met 180 185 190 LeuGln Gly Phe Ala Tyr Thr Gln Leu Glu Gln Ala Arg Val Leu Trp 195 200 205Gly Glu Glu Phe Thr Val Phe Leu Thr Gly Gly Asp Ala Pro Leu Val 210 215220 Arg Ala Ala Leu Pro Gln Ala Arg Val Val Pro Asp Leu Val Phe Val 225230 235 240 Gly Leu Ala Met Ala Cys Pro Leu Asp 245 62 750 DNAPseudomonas putida 62 atgattcttg agctcgattg cggtaacagc ttcatcaagtggcgggtgat ccatgttgcc 60 gatgctgtga ttgaaggtgg tgggatcgtc gattccgatcaggcgctggt ggcggaagtg 120 gctgcgctcg cttcagtgcg tctcacgggt tgccgtattgtcagtgtgcg cagcgaagaa 180 gagaccgatg cgctttgcgc gttgattgct caggcatttgccgtgcaggc gaaggttgcc 240 caccctgtcc gtgaaatggc aggtgtgcgc aatggctatgacgactatca gcgcctgggt 300 atggatcgtt ggctggcggc gttgggggca tttcacctggccaagggcgc gtgcctggtg 360 attgacctgg gtaccgcggc aaaagcggac ttcgtttctgcagatggcga gcatcttggg 420 ggctacatct gcccaggtat gccattgatg cgtagccagctgcgcactca cacccgtcgg 480 atccgctatg acgatgcctc cgcggagcgc gcattgagcagcttgtcacc aggtcgctcg 540 actgtcgaag cggtagagcg cggttgcgta ttgatgctccagggctttgc ctacacccag 600 cttgagcagg ctcgtgtgct atggggtgag gagttcaccgtgttcctcac tggcggtgat 660 gcgccactgg tgagggcggc cctgccacag 
gcgcgggtcgtgcctgacct ggttttcgtt 720 ggcctggcaa tggcttgtcc attggattga 750 63 241PRT Thiobacillus ferrooxidans 63 Met Ile Phe Ile Ala Val Gly Asn Thr ArgThr Leu Leu Ala His Thr 1 5 10 15 His Asp Gly Val His Phe Asp Ser ValSer Val Ala Thr Ser Leu Pro 20 25 30 Pro Thr Glu Ile Leu Gln Gln Pro GlyLeu Thr Trp Leu Ser Ala Pro 35 40 45 Asn Arg Glu Pro Val Ala Leu Gly GlyVal Val Pro Ala Ala Leu Ala 50 55 60 Ala Trp Arg Glu Ala Leu Ala Thr AlaGlu Val Arg Glu Pro Asp Pro 65 70 75 80 Gly Phe Phe Arg Arg Ala Val ProHis Asp Tyr His Pro Pro Glu Ser 85 90 95 Leu Gly Phe Asp Arg Arg Cys CysLeu Leu Ala Ala Ala Met Asp Tyr 100 105 110 Pro Gly Gln Asp Ser Ile ValIle Asp Met Gly Thr Ala Ile Thr Ile 115 120 125 Asp Leu Leu Ala Gly GlyHis Phe Arg Gly Gly Arg Ile Leu Pro Gly 130 135 140 Ile Ala Met Ser LeuArg Gly Leu His Glu Gly Thr Ala Leu Leu Pro 145 150 155 160 Glu Val ValLeu Asn Ala Pro Ala Glu Met Leu Gly Asn Asp Thr Ser 165 170 175 Asn AlaIle Gln Ala Gly Val Ile His Leu Phe Ala Asp Ala Leu Arg 180 185 190 GlyAla Ile Thr Asp Phe Arg Gln Tyr Ser Pro Gln Ala Arg Ile Leu 195 200 205Ile Thr Gly Gly Asp Ala Glu Arg Trp Gln Pro Gly Ile Ala Gly Ser 210 215220 Leu Tyr Gln Pro His Leu Leu Leu Arg Gly Phe Tyr Leu Trp Ile Arg 225230 235 240 Gly 64 726 DNA Thiobacillus ferrooxidans 64 atgatcttcatcgccgtcgg caatacccgc accctgctgg cacacaccca cgatggcgtg 60 catttcgacagcgtcagcgt ggccacttcg ctgccaccca cggaaatcct gcagcagccc 120 ggcttgacatggctcagcgc gccgaaccgg gaacccgtcg cgctgggcgg cgtcgtacct 180 gcggcgcttgccgcctggcg ggaagccttg gccacggcag aggtccgcga acccgacccc 240 ggcttttttcgccgcgccgt gccgcacgac tatcatccgc cggaaagcct cggctttgac 300 cgccgttgctgcctgctcgc cgccgccatg gactaccccg gccaggacag catcgtcatc 360 gacatgggcaccgccatcac catcgacctg ctggctggcg gacatttccg gggcggacgc 420 attctgccgggtatcgccat gagcctgcgc ggtctgcatg aaggcacggc actccttcct 480 gaagtcgtcctgaacgcccc agcggaaatg ctgggcaatg acaccagcaa cgccattcag 540 gccggggtcatccacctctt tgccgatgcc ctgcgcggcg ccattaccga ctttcgccag 600 tacagcccccaggcacggat actgatcacc 
ggtggcgatg ccgaacgttg gcaacccggc 660 atcgctggtagcctgtacca gccccatctg cttctgcgcg gcttttatct gtggatacgg 720 ggatga 726 65 242 PRT Xylella fastidiosa 65 Met Asn Asp Trp Leu Phe Asp Leu Gly AsnSer Arg Phe Lys Cys Ala 1 5 10 15 Ser Leu Arg Glu Gly Val Ile Gly ProVal Thr Val Leu Pro Tyr Leu 20 25 30 Thr Glu Thr Met Asp Ala Phe Ala LeuGln Glu Leu Pro Arg Gly Arg 35 40 45 Val Ala Tyr Leu Ala Ser Val Ala AlaPro Ala Ile Thr Thr His Val 50 55 60 Leu Glu Val Leu Lys Ile His Phe GluGln Val Gln Val Ala Ala Thr 65 70 75 80 Val Ala Ala Cys Ala Gly Val ArgIle Ala Tyr Ala His Pro Glu Arg 85 90 95 Phe Gly Val Asp Arg Phe Leu AlaLeu Leu Gly Ser Tyr Gly Glu Gly 100 105 110 Asn Val Leu Val Val Gly ValGly Thr Ala Leu Thr Ile Asp Leu Leu 115 120 125 Ala Ala Asn Gly Cys HisLeu Gly Gly Arg Ile Ser Ala Ser Pro Thr 130 135 140 Leu Met Arg Gln AlaLeu His Ala Arg Ala Glu Gln Leu Pro Leu Ser 145 150 155 160 Gly Gly AsnTyr Leu Glu Phe Ala Glu Asp Thr Glu Asp Ala Leu Val 165 170 175 Ser GlyCys Asn Gly Ala Ala Val Ala Leu Ile Glu Arg Ser Leu Tyr 180 185 190 GluAla His Gln Arg Leu Asp Gln Ser Val Arg Leu Leu Leu His Gly 195 200 205Gly Gly Val Ala Ser Leu Leu Pro Trp Leu Gly Asp Val Val His Arg 210 215220 Pro Thr Leu Val Leu Asp Gly Leu Ala Ile Trp Ala Ala Val Ala Ala 225230 235 240 Asn Val 66 729 DNA Xylella fastidiosa 66 atgaatgattggttattcga tctaggtaat tcgcgtttta aatgtgcatc gctcagggaa 60 ggtgtgattggtcctgtaac ggttttgccg tacttaacag agaccatgga cgcgtttgcg 120 ttacaggagctaccacgtgg tcgtgtggct tacttggcga gtgtcgctgc tccggctatt 180 actacacatgtgctcgaagt attaaaaatc cacttcgagc aagtccaggt ggctgcaacc 240 gtcgctgcatgtgccggagt acgaattgcc tatgctcacc cggaacgttt tggagtggat 300 aggttcttagcgttgcttgg ttcgtatggt gagggcaatg tcctggtagt gggtgtcggg 360 acagcattgactattgattt gttggctgcc aatggttgtc atctcggagg gcgtatcagt 420 gcttcaccgacattgatgcg ccaagcgttg catgcacgcg cggagcaact ccccctcagt 480 ggtgggaactacttggagtt tgcggaagat acagaggatg cgttggtgtc agggtgcaat 540 ggtgcagcggtggcattgat cgaacgtagc ctgtatgagg cacatcaacg tttggaccag 600 
tcggttcgattattgttgca tggtggaggt gtagcatctt tattgccttg gttgggcgac 660 gtggtacatcgtcctacatt agtattggat ggcctggcga tctgggctgc cgttgcagct 720 aacgtttag 72967 223 PRT Helicobacter pylori 67 Met Pro Ala Arg Gln Ser Phe Thr AspLeu Lys Asn Leu Val Leu Cys 1 5 10 15 Asp Ile Gly Asn Thr Arg Ile HisPhe Ala Gln Asn Tyr Gln Leu Phe 20 25 30 Ser Ser Ala Lys Glu Asp Leu LysArg Leu Gly Ile Gln Lys Glu Ile 35 40 45 Phe Tyr Ile Ser Val Asn Glu GluAsn Glu Lys Ala Leu Leu Asn Cys 50 55 60 Tyr Pro Asn Ala Lys Asn Ile AlaGly Phe Phe His Leu Glu Thr Asp 65 70 75 80 Tyr Val Gly Leu Gly Ile AspArg Gln Met Ala Cys Leu Ala Val Asn 85 90 95 Asn Gly Val Val Val Asp AlaGly Ser Ala Ile Thr Ile Asp Leu Ile 100 105 110 Lys Glu Gly Lys His LeuGly Gly Cys Ile Leu Pro Gly Leu Ala Gln 115 120 125 Tyr Ile His Ala TyrLys Lys Ser Ala Lys Ile Leu Glu Gln Pro Phe 130 135 140 Lys Ala Leu AspSer Leu Glu Val Leu Pro Lys Ser Thr Arg Asp Ala 145 150 155 160 Val AsnTyr Gly Met Val Leu Ser Val Ile Ala Cys Ile Gln His Leu 165 170 175 AlaLys Asn Gln Lys Ile Tyr Leu Cys Gly Gly Asp Ala Lys Tyr Leu 180 185 190Ser Ala Phe Leu Pro His Ser Val Cys Lys Glu Arg Leu Val Phe Asp 195 200205 Gly Met Glu Ile Ala Leu Lys Lys Ala Gly Ile Leu Glu Cys Lys 210 215220 68 672 DNA Helicobacter pylori 68 atgccagcta ggcaatcttt tacagatttgaaaaacctgg ttttgtgcga tataggcaac 60 acgcgtatcc attttgcaca aaactatcagctcttttcaa gcgctaaaga agatttaaag 120 cgtttgggta ttcaaaagga aattttttacattagcgtga atgaagaaaa tgaaaaagcc 180 cttttgaatt gttaccctaa cgctaaaaatattgcagggt tttttcattt agaaaccgac 240 tatgtagggc ttgggataga ccggcaaatggcgtgtctgg cggtaaataa tggcgtggtg 300 gtggatgccg ggagtgcgat tacgatagatttaatcaaag agggcaagca tttaggaggg 360 tgtattttac ccggtttagc ccaatatattcatgcgtata aaaaaagcgc taaaatttta 420 gagcaacctt tcaaggcctt agattctttagaagttttac ctaaaagcac tagagacgct 480 gtgaattacg gcatggtttt gagcgtcattgcttgtatcc agcatttagc caaaaatcaa 540 aaaatctatc tttgtggggg cgatgcgaagtatttgagcg cgtttttacc ccattctgtt 600 tgcaaggagc gtttggtttt tgacgggatggaaatcgctc ttaaaaaagc 
agggatacta 660 gaatgcaaat ga 672 69 750 DNAPseudomonas syringae 69 atgattcttg agctcgattg cggcaacagc tttatcaagtggcggataat cacaaagagt 60 tgctcaacgt tggtcagcgg cggagtagtg gactcggacacagccttgct agagtgcctg 120 ggcaatctgt caggcgcagc attcagcgat tgccgtctggtaagcgttcg tagcgcggaa 180 gaaacggcga agctggtttg cgcgctggca gataccttttccattagccc tgtctgtgca 240 gcgccggcgc cagagcttgc cggggtaatc aatggatacgacgattttgc acgcttgggg 300 ctggatcgct ggttggcatt tgtaggggct taccaccttgttaagggtgc ctgcctggtg 360 atcgatctgg gcaccgccat tacgtctgac tttgttgaagcgtcaggaaa gcatctgggt 420 ggtttcatct gtcctggcat gccactgatg cgcaatcagctgcgtaccca cacccgtcgc 480 attcgatatg acgatgcaga ggctgaaaaa gccctggtacgactcgtgcc tggccgtgcg 540 acggccgagg ctgtggagcg aggttgttct ctcatgcttcgcggattcgc aatgactcag 600 atcgagatag ctcgcgaata ctggggggac gactttgctattttcgtgac aggaggcgac 660 gctgtcttgg ttgctgatgt gttaccgggc gctcgcattgtccctgattt ggtattcgtt 720 ggcctggctc tcgcttgccc tttacgttga 750 70 249PRT Pseudomonas syringae 70 Met Ile Leu Glu Leu Asp Cys Gly Asn Ser PheIle Lys Trp Arg Ile 1 5 10 15 Ile Thr Lys Ser Cys Ser Thr Leu Val SerGly Gly Val Val Asp Ser 20 25 30 Asp Thr Ala Leu Leu Glu Cys Leu Gly AsnLeu Ser Gly Ala Ala Phe 35 40 45 Ser Asp Cys Arg Leu Val Ser Val Arg SerAla Glu Glu Thr Ala Lys 50 55 60 Leu Val Cys Ala Leu Ala Asp Thr Phe SerIle Ser Pro Val Cys Ala 65 70 75 80 Ala Pro Ala Pro Glu Leu Ala Gly ValIle Asn Gly Tyr Asp Asp Phe 85 90 95 Ala Arg Leu Gly Leu Asp Arg Trp LeuAla Phe Val Gly Ala Tyr His 100 105 110 Leu Val Lys Gly Ala Cys Leu ValIle Asp Leu Gly Thr Ala Ile Thr 115 120 125 Ser Asp Phe Val Glu Ala SerGly Lys His Leu Gly Gly Phe Ile Cys 130 135 140 Pro Gly Met Pro Leu MetArg Asn Gln Leu Arg Thr His Thr Arg Arg 145 150 155 160 Ile Arg Tyr AspAsp Ala Glu Ala Glu Lys Ala Leu Val Arg Leu Val 165 170 175 Pro Gly ArgAla Thr Ala Glu Ala Val Glu Arg Gly Cys Ser Leu Met 180 185 190 Leu ArgGly Phe Ala Met Thr Gln Ile Glu Ile Ala Arg Glu Tyr Trp 195 200 205 GlyAsp Asp Phe Ala Ile Phe Val Thr Gly Gly Asp Ala Val Leu Val 210 
215 220Ala Asp Val Leu Pro Gly Ala Arg Ile Val Pro Asp Leu Val Phe Val 225 230235 240 Gly Leu Ala Leu Ala Cys Pro Leu Arg 245 71 8320 DNA ArtificialSequence Description of Artificial Sequence plasmid, pAN296 71tgcgccgcta cagggcgcgt ccattcgcca ttcaggctgc gcaactgttg ggaagggcga 60tcggtgcggg cctcttcgct attacgccag ctggcgaaag ggggatgtgc tgcaaggcga 120ttaagttggg taacgccagg gttttcccag tcacgacgtt gtaaaacgac ggccagtgaa 180ttgtaatacg actcactata gggcgaattg ggcccgacgt cgcatgctgg atgaaaagcc 240gatgaccgct tttcaggtct gtcagcagct ttttcctgct gtatatgaaa aggaattgtt 300tttaacgatg tcagaaacgg caggtcacct tgatgtgttg gaggctgaag aagccatcac 360gtcatattgg gaaggaaata ccgtatactt taaaacaatg aagaggtgaa atgggtgaaa 420catatagcgg gaaaaaggat ttggataacc ggcgcttcag gagggcttgg agaaagaatc 480gcatacttat gcgcggctga aggagcccat gtcctgctgt cggctagacg cgaggatcgt 540ttgatagaaa tcaaaaggaa aataaccgag gaatggagcg gacagtgtga gatttttcct 600ctggatgtcg gccgcctaga ggatatcgcc cgggtccgcg atcagatcgg ctcgattgat 660gtactgatta acaatgcagg cttcggtata tttgaaacgg ttttagactc tacattggat 720gacatgaaag cgatgtttga tgtgaatgtc ttcggcctga tcgcctgtac aaaagcggtg 780cttccgcaaa tgcttgagca aaaaaaggga catatcatca atatcgcctc tcaagcgggg 840aaaatcgcca caccgaagtc tagcctgtat tccgcgacca aacatgccgt gttaggttac 900tcaaacgctt tgcggatgga gctttcggga accggcattt atgtgacaac agtcaacccg 960ggcccgattc agacggactt tttttccatt gctgataaag gcggggacta cgccaaaaat 1020gtcggccgct ggatgcttga tcctgatgac gtggcagctc aaattacagc tgcaattttt 1080acgaaaaagc gggagatcaa tcttccgcgt ttaatgaatg ccggcactaa gctgtatcag 1140ctgtttccag ctcttgtaga aaagctggca ggacgcgcgc tcatgaaaaa ataatgatag 1200aactgcctgt ggtggagtgg cttgtttctc acggggcagt ttttgatagt ggaagggaga 1260gattgttgaa tgtcagttca ttcagaagtc cttcatgctc tgcttaaaga tccgtttatt 1320cagaaactga ttgatgcaga gcctgtattc tgggcaaatt caggcaagaa agaggggcca 1380ttaccccgtg cagatgagtg ggcaaccgag atagcggaag cggaaaaaag aatgcagcgg 1440tttgcacctt acattgccga ggtgtttcct gagacgaaag gcgctaaagg aatcatcgag 1500tctccgcttt ttgaggtgca gcatatgaag ggaaagctgg aagcggcata 
tcagcagcca 1560tttcccggaa gatggctttt aaagtgcgac catgagcttc cgatttcagg atcgattaaa 1620gcgaggggcg ggatttatga agtgttaaag tatgctgaaa atctcgcgct tcaagaagga 1680atgcttcagg aaaccgatga ttaccgcatc ttacaggaag agcggtttac cgggtttttc 1740tcccgctatt cgattgctgt cggttcgaca ggaaatctag gtttaagcat cggcatcatc 1800ggcgcggcac tcgggtttcg cgtgacagtg catatgtccg ccgatgctaa gcagtggaaa 1860aaggatctcc tccgccaaaa gggagtcact gttatggagt acgaaacaga ttacagtgaa 1920gcggtgaacg aagggagacg gcaggcggaa caagatccat tctgttattt tattgatgat 1980gaacattctc gtcagctgtt cttaggatat gctgttgctg caagccgatt aaaaacacag 2040cttgactgta tgaatataaa gccaagtctt gagacgccct tgtttgtgta tctgccgtgc 2100ggagtcggcg gaggaccggg cggtgtagca tttgggctga agcttttata cggagatgat 2160gttcatgtgt ttttcgcaga accaactcat tcaccttgta tgctgttagg gctttattca 2220ggacttcacg agaagatctc cgtccaggat atcggcctgg ataatcagac ggctgctgac 2280ggacttgccg tagggaggcc gtcaggattt gtcggcaagc tgattgaacc gcttctgagc 2340ggctgttata cggtagagga caatacgctt tatactttgc ttcatatgct ggctgtatct 2400gaagataaat atttagagcc ctctgctctt gctggcatgt tcgggccggt tcagcttttt 2460tcgacagaag agggaaggcg ctatgctcag aaatataaga tggaacatgc cgtacatgtc 2520gtctggggaa cgggaggaag catggttcca aaagatgaaa tggctgcgta taaccgaatc 2580ggtgctgatt tgctaaaaaa acgaaatgga aaataagcag acagtgaaaa ggttttccgt 2640tacaatcttt gtaagggttt taacctacag agagtcaggt gtaaacagtg aaaaataaag 2700aacttaacct acatacttta tatacacagc acaatcggga gtcttctgca gctcgagcaa 2760tagttaccct tattatcaag ataagaaaga aaaggatttt tcgctacgct caaatccttt 2820aaaaaaacac aaaagaccac attttttaat gtggtcttta ttcttcaact aaagcaccca 2880ttagttcaac aaacgaaaat tggataaagt gggatatttt taaaatatat atttatgtta 2940cagtaatatt gacttttaaa aaaggattga ttctaatgaa gaaagcagac aagtaagcct 3000cctaaattca ctttagataa aaatttagga ggcatatcaa atgaacttta ataaaattga 3060tttagacaat tggaagagaa aagagatatt taatcattat ttgaaccaac aaacgacttt 3120tagtataacc acagaaattg atattagtgt tttataccga aacataaaac aagaaggata 3180taaattttac cctgcattta ttttcttagt gacaagggtg ataaactcaa atacagcttt 3240tagaactggt tacaatagcg 
acggagagtt aggttattgg gataagttag agccacttta 3300tacaattttt gatggtgtat ctaaaacatt ctctggtatt tggactcctg taaagaatga 3360cttcaaagag ttttatgatt tatacctttc tgatgtagag aaatataatg gttcggggaa 3420attgtttccc aaaacaccta tacctgaaaa tgctttttct ctttctatta ttccatggac 3480ttcatttact gggtttaact taaatatcaa taataatagt aattaccttc tacccattat 3540tacagcagga aaattcatta ataaaggtaa ttcaatatat ttaccgctat ctttacaggt 3600acatcattct gtttgtgatg gttatcatgc aggattgttt atgaactcta ttcaggaatt 3660gtcagatagg cctaatgact ggcttttata atatgagata atgccgactg tactttttac 3720agtcggtttt ctaatgtcac taacctgccc cgttagttga agaaggtttt tatattacag 3780ctgtcgacta aggtcgagga agtgttggta aggagggtat gaaatgtgca tcatattgaa 3840ctgtatgtct ctgatttgga ggcgtctagg cggttttggg gctggttctt aaaagaactt 3900ggttataaag agtatcaaaa atggagctca ggcatcagct ggaagaaaga tcgtttttac 3960ctagtgattg tgcaggcgaa agagccattt ctagagccgg aataccatag atgccgagtc 4020ggtctgaacc atctcgcatt tcatgctgaa tccaagcttc aagtcgatca gatgactgaa 4080aaattgacgg caaaaggcta tcgtgtgttg taccgagaca ggcatccttt tgccggagga 4140gacgggcatt atgcagtctt ttgtgaggat ccagaccgga ttaaggtaga gctcgttgcc 4200ccaagctgtt aatcgtgatc ttcggacagg ctgttcagct ttttctcaat gcgatccagc 4260tgcgcttttc ggtttttcgc atacttgaag cctgtaacag ccgcaaagac gacagcggca 4320aatataataa atacaaacag ctgaaacatc acatcaccta tattcatgtt cttcacctca 4380tgtttgcggg agagattcat tctcttccgt tttttattta aagcggcttt tccagacggg 4440aacggtgttt tgtggtctcc attttcattt gccgataggc gaacgctaaa aatggcaggc 4500cgagcagggt aatgccgctc aggacagaaa aaatataaat cggccggcca gcgccaaaca 4560ggtctataca tatccccccg acccaagggc cgatgacgtt tccgagctgt ggaaaaccga 4620ttgccccgaa ataagtgcct tttaatcctg gttttgcaat ctggtctaca tacaaatcca 4680tcatagagaa taaaagcact tcgccgattg taaatgtgat gacaatcatc acaattgatg 4740gaacaccgtg tgatacggtg aaaatggcca tgctgatgct aaccatcaca ttaccgagca 4800tcagagaaca aagcggcgaa aaccgttttg caaaatggac aatgggaaat tgcgtcgcca 4860acacaacgat tgcgtttaat gtcagcatca gcccatacag cttcgttcca ttgccgatca 4920aggggttctg cgccatatac tgagggaatg tggaactgaa ttgtgagtag 
ccgaaggtgc 4980atagcgtaat gccgaccaaa gcaatggtaa aaagataatc cttttgcgtg accataaacg 5040cttcccgcac gctcatattt cgggactggg ctggtgctga taaggatgga tgttttttaa 5100attggagggc aagcacaatt ccgtatagtc cgtaaatgac tgcaggcacc aaaaagggcg 5160tagtcgattg cgatgagccg aaatataggc caagcacagg tccgaagaca acgccgatat 5220taatagccgc atagcgtaaa ttaaaaacta gcagtctcgt tttttcttct gtcatatcag 5280acaacaaggc ctttgaagcg ggctcaaaca gtgatttgca aagaccgttt aatgcgttta 5340ctacaaaaaa cacccagaga ttagatgctg ccgcaaagcc tgcaaatacc agcatccatc 5400cgaaaatcga tacaagcatc atgttttttc tgccgaattt atctgagata tatccgccgt 5460aaaagcttgc gaggatgccg actgatgagc tcgcggcgat gaccagccct gcataggaag 5520ctgatgcgcc ttggacggct gtcaaataaa tcgctaaaaa aggaatgctc atcgatgttg 5580ccattctgcc gaaaatggtt ccgattataa ttgtaacgcg ttggatgcat agcttgagta 5640ttctatagtg tcacctaaat agcttggcgt aatcatggtc atagctgttt cctgtgtgaa 5700attgttatcc gctcacaatt ccacacaaca tacgagccgg aagcataaag tgtaaagcct 5760ggggtgccta atgagtgagc taactcacat taattgcgtt gcgctcactg cccgctttcc 5820agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg ccaacgcgcg gggagaggcg 5880gtttgcgtat tgggcgctct tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc 5940ggctgcggcg agcggtatca gctcactcaa aggcggtaat acggttatcc acagaatcag 6000gggataacgc aggaaagaac atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa 6060aggccgcgtt gctggcgttt ttcgataggc tccgcccccc tgacgagcat cacaaaaatc 6120gacgctcaag tcagaggtgg cgaaacccga caggactata aagataccag gcgtttcccc 6180ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc gcttaccgga tacctgtccg 6240cctttctccc ttcgggaagc gtggcgcttt ctcatagctc acgctgtagg tatctcagtt 6300cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga accccccgtt cagcccgacc 6360gctgcgcctt atccggtaac tatcgtcttg agtccaaccc ggtaagacac gacttatcgc 6420cactggcagc agccactggt aacaggatta gcagagcgag gtatgtaggc ggtgctacag 6480agttcttgaa gtggtggcct aactacggct acactagaag gacagtattt ggtatctgcg 6540ctctgctgaa gccagttacc ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa 6600ccaccgctgg tagcggtggt ttttttgttt gcaagcagca gattacgcgc agaaaaaaag 6660gatctcaaga agatcctttg 
atcttttcta cggggtctga cgctcagtgg aacgaaaact 6720cacgttaagg gattttggtc atgagattat caaaaaggat cttcacctag atccttttaa 6780attaaaaatg aagttttaaa tcaatctaaa gtatatatga gtaaacttgg tctgacagtt 6840accaatgctt aatcagtgag gcacctatct cagcgatctg tctatttcgt tcatccatag 6900ttgcctgact ccccgtcgtg tagataacta cgatacggga gggcttacca tctggcccca 6960gtgctgcaat gataccgcga gacccacgct caccggctcc agatttatca gcaataaacc 7020agccagccgg aagggccgag cgcagaagtg gtcctgcaac tttatccgcc tccatccagt 7080ctattaattg ttgccgggaa gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg 7140ttgttggcat tgctacaggc atcgtggtgt cacgctcgtc gtttggtatg gcttcattca 7200gctccggttc ccaacgatca aggcgagtta catgatcccc catgttgtgc aaaaaagcgg 7260ttagctcctt cggtcctccg atcgttgtca gaagtaagtt ggccgcagtg ttatcactca 7320tggttatggc agcactgcat aattctctta ctgtcatgcc atccgtaaga tgcttttctg 7380tgactggtga gtactcaacc aagtcattct gagaataccg cgcccggcga ccgagttgct 7440cttgcccggc gtcaatacgg gataatagtg tatgacatag cagaacttta aaagtgctca 7500tcattggaaa acgttcttcg gggcgaaaac tctcaaggat cttaccgctg ttgagatcca 7560gttcgatgta acccactcgt gcacccaact gatcttcagc atcttttact ttcaccagcg 7620tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa aaagggaata agggcgacac 7680ggaaatgttg aatactcata ctcttccttt ttcaatatta ttgaagcatt tatcagggtt 7740attgtctcat gagcggatac atatttgaat gtatttagaa aaataaacaa ataggggttc 7800cgcgcacatt tccccgaaaa gtgccacctg tatgcggtgt gaaataccgc acagatgcgt 7860aaggagaaaa taccgcatca ggcgaaattg taaacgttaa tattttgtta aaattcgcgt 7920taaatatttg ttaaatcagc tcatttttta accaataggc cgaaatcggc aaaatccctt 7980ataaatcaaa agaatagacc gagatagggt tgagtgttgt tccagtttgg aacaagagtc 8040cactattaaa gaacgtggac tccaacgtca aagggcgaaa aaccgtctat cagggcgatg 8100gcccactacg tgaaccatca cccaaatcaa gttttttgcg gtcgaggtgc cgtaaagctc 8160taaatcggaa ccctaaaggg agcccccgat ttagagcttg acggggaaag ccggcgaacg 8220tggcgagaaa ggaagggaag aaagcgaaag gagcgggcgc tagggcgctg gcaagtgtag 8280cggtcacgct gcgcgtaacc accacacccg ccgcgcttaa 8320 72 6688 DNA ArtificialSequence Description of Artificial Sequence plasmid, pAN336 
72tgcgccgcta cagggcgcgt ccattcgcca ttcaggctgc gcaactgttg ggaagggcga 60tcggtgcggg cctcttcgct attacgccag ctggcgaaag ggggatgtgc tgcaaggcga 120ttaagttggg taacgccagg gttttcccag tcacgacgtt gtaaaacgac ggccagtgaa 180ttgtaatacg actcactata gggcgaattg ggcccgacgt cgcatgcacc aggcttctca 240ggcgctgact tagaaaacct cttgaatgaa gctgcgcttg tagcggctcg tcaaaacaag 300aaaaaaatcg atgcgcgtga tattgacgaa gcgacggacc gtgtaattgc cggacccgct 360aagaagagcc gcgttatctc caagaaagaa cgcaatatcg tggcttatca cgaaggcgga 420cacaccgtta tcggtctcgt tttagatgag gcagatatgg ttcataaagt aacgattgtt 480cctcggggcc aggctggcgg ttatgctgtt atgctgccaa gagaagaccg ttatttccaa 540acaaagccgg agctgcttga taaaattgtc ggcctcttgg gcggacgtgt tgctgaagag 600attatcttcg gtgaagtcag cacaggggcg cacaatgact tccagcgtgc gacgaatatt 660gcaagacgaa tggttacaga attcggtatg tcagaaaaac tgggaccgtt gcaatttgga 720cagtctcagg gcggtcaggt attcttaggc cgtgatttca acaacgaaca gaactacagt 780gatcaaatcg cttacgaaat tgatcaggaa attcagcgca tcatcaaaga atgttatgag 840cgtgcgaaac aaatcctgac tgaaaatcgt gacaagcttg aattgattgc ccaaacgctt 900ctgaaagttg aaacgcttga cgctgaacaa atcaaacacc ttatcgatca tggaacatta 960cctgagcgta atttctcaga tgatgaaaag aacgatgatg tgaaagtaaa cattctgaca 1020aaaacagaag aaaagaaaga cgatacgaaa gagtaattcg ctttctttct aaaaaaactg 1080ccggctgacg ctggcagttt ttttatgtaa atgattggct cagctgcggc ttttacaatc 1140atccaattct ggtatcgatt tgtttacaaa tgagccgctg atcgtgtatg gtattgtaga 1200atgtttgtaa aaagtaaagt agagaaacta ttcaaaagtg gtgatagagg ttgttactgg 1260ttatcgatgt ggggaacacc ctgcagctcg agtgaaatac cgcacagatg cgtaaggaga 1320aaataccgca tcaggcgata aacccagcga accatttgag gtgataggta agattatacc 1380gaggtatgaa aacgagaatt ggacctttac agaattactc tatgaagcgc catatttaaa 1440aagctaccaa gacgaagagg atgaagagga tgaggaggca gattgccttg aatatattga 1500caatactgat aagataatat atcttttata tagaagatat cgccgtatgt aaggatttca 1560gggggcaagg cataggcagc gcgcttatca atatatctat agaatgggca aagcataaaa 1620acttgcatgg actaatgctt gaaacccagg acaataacct tatagcttgt aaattctatc 1680ataattgtgg tttcaaaatc ggctccgtcg atactatgtt atacgccaac 
tttcaaaaca 1740actttgaaaa agctgttttc tggtatttaa ggttttagaa tgcaaggaac agtgaattgg 1800agttcgtctt gttataatta gcttcttggg gtatctttaa atactgtaga aaagaggaag 1860gaaataataa atggctaaaa tgagaatatc accggaattg aaaaaactga tcgaaaaata 1920ccgctgcgta aaagatacgg aaggaatgtc tcctgctaag gtatataagc tggtgggaga 1980aaatgaaaac ctatatttaa aaatgacgga cagccggtat aaagggacca cctatgatgt 2040ggaacgggaa aaggacatga tgctatggct ggaaggaaag ctgcctgttc caaaggtcct 2100gcactttgaa cggcatgatg gctggagcaa tctgctcatg agtgaggccg atggcgtcct 2160ttgctcggaa gagtatgaag atgaacaaag ccctgaaaag attatcgagc tgtatgcgga 2220gtgcatcagg ctctttcact ccatcgacat atcggattgt ccctatacga atagcttaga 2280cagccgctta gccgaattgg attacttact gaataacgat ctggccgatg tggattgcga 2340aaactgggaa gaagacactc catttaaaga tccgcgcgag ctgtatgatt ttttaaagac 2400ggaaaagccc gaagaggaac ttgtcttttc ccacggcgac ctgggagaca gcaacatctt 2460tgtgaaagat ggcaaagtaa gtggctttat tgatcttggg agaagcggca gggcggacaa 2520gtggtatgac attgccttct gcgtccggtc gatcagggag gatatcgggg aagaacagta 2580tgtcgagcta ttttttgact tactggggat caagcctgat tgggagaaaa taaaatatta 2640tattttactg gatgaattgt tttagtacct agatttagat gtctaaaaag ctttaactac 2700aagcttttta gacatctaat cttttctgaa gtacatccgc aactgtccat actctgatgt 2760tttatatctt ttctaaaagt tcgctagata ggggtcccga gcgcctacga ggaatttgta 2820tcgccattcg ccattcaggc tgcgcaactg ttgggaaggg cgatcggtgc ggtcgactgg 2880caggcaaaac aggacccaag gtcattgcga caggaggcct ggcgccgctc attgcgaacg 2940aatcagattg tatagacatc gttgatccat tcttaaccct aaaagggctg gaattgattt 3000atgaaagaaa ccgcgtagga agtgtatagg aggtttagta atggattatt tagtaaaagc 3060acttgcgtat gacggaaaag ttcgggctta tgcagcgaga acgactgata tggtaaatga 3120ggggcagaga cgccatggta cgtggccgac agcatccgct gcactaggcc gtacaatgac 3180agcttcactt atgctcggcg ctatgctgaa gggcgatgat aagctgaccg tgaaaatcga 3240gggcggaggt ccgatcggag ctattgtagc tgatgccaat gccaaaggag aagtcagagc 3300ctatgtctct aacccgcaag ttcattttga tttaaatgaa caaggtaagc ttgatgtcag 3360acgtgcggtt ggaacaaacg gaacgttaag tgtcgtaaaa gatttaggtt tgcgcgagtt 3420cttcacagga caagtagaaa 
tcgtttcagg agaattagga gatgatttta cttactatct 3480tgtgtcatct gagcaggttc cttcatcagt gggcgtaggt gtgctcgtaa atcctgacaa 3540taccattctt gcggcagggg gctttattat tcagctgatg ccgggaacag atgatgaaac 3600aatcacaaaa attgaacagc gtctatctca agtagagccg atttctaagc tcatccaaaa 3660agggctgaca ccagaagaaa ttttagaaga agtcctaggc gagaaacctg agattttgga 3720aacgatgcct gtcagattcc attgcccttg ttcaaaagaa cggttcgaaa cagccatttt 3780aggactaggc aaaaaagaaa ttcaagatat gatagaagaa gatggacaag ccgaagcagt 3840atgccatttt tgtaatgaaa agtacttatt tacaaaagaa gagctggaag ggcttcgtga 3900ccaaactacc cgctaagctc tttagcgggt ttttaatttg agaaaagggg ctgaaagcag 3960gtttgaaatc aagaacaatc tggacgcgtt ggatgcatag cttgagtatt ctatagtgtc 4020acctaaatag cttggcgtaa tcatggtcat agctgtttcc tgtgtgaaat tgttatccgc 4080tcacaattcc acacaacata cgagccggaa gcataaagtg taaagcctgg ggtgcctaat 4140gagtgagcta actcacatta attgcgttgc gctcactgcc cgctttccag tcgggaaacc 4200tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg 4260ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag 4320cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg gataacgcag 4380gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc 4440tggcgttttt cgataggctc cgcccccctg acgagcatca caaaaatcga cgctcaagtc 4500agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 4560tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 4620cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 4680ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 4740ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 4800ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 4860ggtggcctaa ctacggctac actagaagga cagtatttgg tatctgcgct ctgctgaagc 4920cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta 4980gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 5040atcctttgat cttttctacg gggtctgacg ctcagtggaa cgaaaactca cgttaaggga 5100ttttggtcat gagattatca aaaaggatct tcacctagat ccttttaaat 
taaaaatgaa 5160gttttaaatc aatctaaagt atatatgagt aaacttggtc tgacagttac caatgcttaa 5220tcagtgaggc acctatctca gcgatctgtc tatttcgttc atccatagtt gcctgactcc 5280ccgtcgtgta gataactacg atacgggagg gcttaccatc tggccccagt gctgcaatga 5340taccgcgaga cccacgctca ccggctccag atttatcagc aataaaccag ccagccggaa 5400gggccgagcg cagaagtggt cctgcaactt tatccgcctc catccagtct attaattgtt 5460gccgggaagc tagagtaagt agttcgccag ttaatagttt gcgcaacgtt gttggcattg 5520ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc ttcattcagc tccggttccc 5580aacgatcaag gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt agctccttcg 5640gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg gttatggcag 5700cactgcataa ttctcttact gtcatgccat ccgtaagatg cttttctgtg actggtgagt 5760actcaaccaa gtcattctga gaataccgcg cccggcgacc gagttgctct tgcccggcgt 5820caatacggga taatagtgta tgacatagca gaactttaaa agtgctcatc attggaaaac 5880gttcttcggg gcgaaaactc tcaaggatct taccgctgtt gagatccagt tcgatgtaac 5940ccactcgtgc acccaactga tcttcagcat cttttacttt caccagcgtt tctgggtgag 6000caaaaacagg aaggcaaaat gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa 6060tactcatact cttccttttt caatattatt gaagcattta tcagggttat tgtctcatga 6120gcggatacat atttgaatgt atttagaaaa ataaacaaat aggggttccg cgcacatttc 6180cccgaaaagt gccacctgta tgcggtgtga aataccgcac agatgcgtaa ggagaaaata 6240ccgcatcagg cgaaattgta aacgttaata ttttgttaaa attcgcgtta aatatttgtt 6300aaatcagctc attttttaac caataggccg aaatcggcaa aatcccttat aaatcaaaag 6360aatagaccga gatagggttg agtgttgttc cagtttggaa caagagtcca ctattaaaga 6420acgtggactc caacgtcaaa gggcgaaaaa ccgtctatca gggcgatggc ccactacgtg 6480aaccatcacc caaatcaagt tttttgcggt cgaggtgccg taaagctcta aatcggaacc 6540ctaaagggag cccccgattt agagcttgac ggggaaagcc ggcgaacgtg gcgagaaagg 6600aagggaagaa agcgaaagga gcgggcgcta gggcgctggc aagtgtagcg gtcacgctgc 6660gcgtaaccac cacacccgcc gcgcttaa 6688 73 9396 DNA Artificial SequenceDescription of Artificial Sequence plasmid, pAN341 and pAN342 73ttgcggccgc ttcgaactgt tataaaaaaa ggatcaattt tgaactctct cccaaagttg 60atcccttaac gatttagaaa tccctttgag 
aatgtttata tacattcaag gtaaccagcc 120aactaatgac aatgattcct gaaaaaagta ataacaaatt actatacaga taagttgact 180gatcaacttc cataggtaac aacctttgat caagtaaggg tatggataat aaaccaccta 240caattgcaat acctgttccc tctgataaaa agctggtaaa gttaagcaaa ctcattccag 300caccagcttc ctgctgtttc aagctacttg aaacaattgt tgatataact gttttggtga 360acgaaagccc acctaaaaca aatacgatta taattgtcat gaaccatgat gttgtttcta 420aaagaaagga agcagttaaa aagctaacag aaagaaatgt aactccgatg tttaacacgt 480ataaaggacc tcttctatca acaagtatcc caccaatgta gccgaaaata atgacactca 540ttgttccagg gaaaataatt acacttccga tttcggcagt acttagctgg tgaacatctt 600tcatcatata aggaaccata gagacaaacc ctgctactgt tccaaatata attcccccac 660aaagaactcc aatcataaaa ggtatatttt tccctaatcc gggatcaaca aaaggatctg 720ttactttcct gatatgtttt acaaatatca ggaatgacag cacgctaacg ataagaaaag 780aaatgctata tgatgttgta aacaacataa aaaatacaat gcctacagac attagtataa 840ttcctttgat atcaaaatga ccttttatcc ttacttcttt ctttaataat ttcataagaa 900acggaacagt gataattgtt atcataggaa tgagtagaag ataggaccaa tgaatataat 960gggctatcat tccaccaatc gctggaccga ctccttctcc catggctact atcgatccaa 1020taagaccaaa tgctttaccc ctattttcct ttggaatata gcgcgcaact acaaccatta 1080cgagtgctgg aaatgcagct gcaccagccc cttgaataaa acgagccata ataagtaagg 1140aaaagaaaga atggccaaca aacccaatta ccgacccgaa acaatttatt ataattccaa 1200ataggagtaa ccttttgatg cctaattgat cagatagctt tccatataca gctgttccaa 1260tggaaaaggt taacataaag gctgtgttca cccagtttgt actcgcaggt ggtttattaa 1320aatcatttgc aatatcaggt aatgagacgt tcaaaaccat ttcatttaat acgctaaaaa 1380aagataaaat gcaaagccaa attaaaattt ggttgtgtcg taaattcgat tgtgaatagg 1440atgtattcac atttcaccct ccaataatga gggcagacgt agtttatagg gttaatgata 1500cgcttccctc ttttaattga accctgttac attcattaca cttcataatt aattcctcct 1560aaacttgatt aaaacatttt accacatata aactaagttt taaattcagt atttcatcac 1620ttatacaaca atatggcccg tttgttgaac tactctttaa taaaataatt tttccgttcc 1680caattccaca ttgcaataat agaaaatcca tcttcatcgg ctttttcgtc atcatctgta 1740tgaatcaaat cgccttcttc tgtgtcatca aggtttaatt ttttatgtat ttcttttaac 1800aaaccaccat 
aggagattaa ccttttacgg tgtaaacctt cctccaaatc agacaaacgt 1860ttcaaattct tttcttcatc atcggtcata aaatccgtat cctttacagg atattttgca 1920gtttcgtcaa ttgccgattg tatatccgat ttatatttat ttttcggtcg aatcatttga 1980acttttacat ttggatcata gtctaatttc attgcctttt tccaaaattg aatccattgt 2040ttttgattca cgtagttttc tgtattctta aaataagttg gttccacaca taccaataca 2100tgcatgtgct gattataaga attatcttta ttatttattg tcacttccgt tgcacgcata 2160aaaccaacaa gatttttatt aattttttta tattgcatca ttcggcgaaa tccttgagcc 2220atatctgaca aactcttatt taattcttcg ccatcataaa catttttaac tgttaatgtg 2280agaaacaacc aacgaactgt tggcttttgt ttaataactt cagcaacaac cttttgtgac 2340tgaatgccat gtttcattgc tctcctccag ttgcacattg gacaaagcct ggatttacaa 2400aaccacactc gatacaactt tctttcgcct gtttcacgat tttgtttata ctctaatatt 2460tcagcacaat cttttactct ttcagccttt ttaaattcaa gaatatgcag aagttcaaag 2520taatcaacat tagcgatttt cttttctctc catggggaat tggaattctc agtcgctcca 2580gttgcaaacg attctggata gtttgccgga tttgcgatag aaccaggccc gccgggaata 2640aagagatccg tattccccgc tgaaaactca gggaaaatat cggccgaacg ccaggcattg 2700accatgtctc tgtaccattc atcaagtcca gagccccctc cccatgagtt attgacaaca 2760tcaggagcca tttccgggtg gggatttcct tccgcgtcct ttggtgctaa aacccattca 2820ccagcttcca aaatgtcagc atcagtgccg ccatcttcag agaacgcttt aacagcaatc 2880cattttgcgc caggtgctac accgatttga tttgttccat caggttcaga gcccaccatc 2940gtgcctgtca cgtgggttcc atgagccaaa tcatcataag ggcttgcctc gcctgctacg 3000gcatcatacc agttcatttc attttcaggc tcattaggat tttccggatt atatccgcga 3060tatttctctt ttaatgccgg atgattccat tccaccccgg tatcaatgga cgcaacaacc 3120gtgccagttc catcatatcc aagtgcccaa gcttttgggg catcgatttg gtctacattc 3180cattccacac cgtcagttgc tttaatagct ttctgtgctt ttttcatatt aaatggggag 3240gatgacttaa aaagctgccg tttctcatta ggaagcacct tttccacttc gggaaactgc 3300accacttttt ccataacctc ttttgaggca tgaacagcaa tcccgttcac cacataataa 3360gaatgaattt ggtctgcatt tcctttatct ttctgggtgt tcaagtattt taggacatct 3420tgctgggatt catcggctgt gacttttaaa gatgacacaa cagcagaacg cttttgatat 3480tccgtcttag cggcagacag cttcttcgat ttcgcttttt 
taacagccgc ttttgccgct 3540ttttctgggt acccctatgt attactagaa aataacatag taaaacggac atcactccgt 3600ttcaatggag gtgatgtccg tttttcatta caacaaatta cttatctatt tgtaatgctg 3660ctcttggacc cgggatccgc aagtgctttt actaaataat ccattactaa acctcctata 3720cacttcctac gcggtttctt tcataaatca attccagccc ttttagggtt aagaatggat 3780caacgatgtc tatacaatct gattcgttcg caatgagcgg cgccaggcct cctgtcgcaa 3840tgaccttggg tcctgttttg cctgccattt cattcgctta acgattcctt ccacttggcc 3900gacatagcca aataaaattc cagattgcat cgcgctaaca gtgttttttc cgataatatt 3960gtcgggccgg gtgatttcga tacgaggaag ctttgctgca cgcgagtaaa gcgcctctgt 4020cgaaattgta atcccagggg caatcgcccc gcccatgtat tgtttgtttt catcaatata 4080gcagtacgtt gtggcggttc cgaaatcgac aacaattaat ggattgccgt acaagtgtat 4140cgcagcgaca gcatttacga ttctgtctgc ccctacttct ttcggattgt catattttat 4200atttaaaccg gttttcatac ctggaccaac aatttgaggc tcgatatgaa agtattttgt 4260gcacattctt tctaacgcaa acatgattgg cggcactact gacgaaataa taatgccatc 4320tatctgttca aacataagcc cggagtgatc aaataaggag cgcaaaatca tcccaaactc 4380atcttctgtt ttatgcctgc ttgtttctat acgccagtga tattctaatt ttccatcatg 4440atatacacca agtacagtat tggtgttccc cacatcgata accagtaaca acctctatca 4500ccacttttga atagtttctc tagaacaggc ggggttgccc ccgcctgtaa ttaaattatt 4560acacaccctg tagggaaagt caataccttt ttgtaaaatt tttacacagc gtggatctct 4620tctagggaca cctctttgta cccctcaagg gagaaatatt ggcggtactg agcacagttt 4680tggttggtgg acagtgaacc atagctgtcg tcaatagcct cgagttatgg cagttggtta 4740aaaggaaaca aaaagaccgt tttcacacaa aacggtcttt ttcgatttct ttttacagtc 4800acagccactt ttgcaaaaac cggacagctt catgccttat aactgctgtt tcggtcgaca 4860agcttcgcga agcggccgca aaattcactg gccgtcgttt tacaacgtcg tgactgggaa 4920aaccctggcg ttacccaact taatcgcctt gcagcacatc cccctttcgc cagctggcgt 4980aatagcgaag aggcccgcac cgatcgccct tcccaacagt tgcgcagcct gaatggcgaa 5040tggcgcctga tgcggtattt tctccttacg catctgtgcg gtatttcaca ccgcatatgg 5100tgcactctca gtacaatctg ctctgatgcc gcatagttaa gccagccccg acacccgcca 5160acacccgctg actatgcttg taaaccgttt tgtgaaaaaa tttttaaaat aaaaaagggg 5220acctctaggg 
tccccaatta attagtaata taatctatta aaggtcattc aaaaggtcat 5280ccaccggatc agcttagtaa agccctcgct agattttaat gcggatgttg cgattacttc 5340gccaactatt gcgataacaa gaaaaagcca gcctttcatg atatatctcc caatttgtgt 5400agggcttatt atgcacgctt aaaaataata aaagcagact tgacctgata gtttggctgt 5460gagcaattat gtgcttagtg catctaacgc ttgagttaag ccgcgccgcg aagcggcgtc 5520ggcttgaacg aattgttaga cattatttgc cgactacctt ggtgatctcg cctttcacgt 5580agtggacaaa ttcttccaac tgatctgcgc gcgaggccaa gcgatcttct tcttgtccaa 5640gataagcctg tctagcttca agtatgacgg gctgatactg ggccggcagg cgctccattg 5700cccagtcggc agcgacatcc ttcggcgcga ttttgccggt tactgcgctg taccaaatgc 5760gggacaacgt aagcactaca tttcgctcat cgccagccca gtcgggcggc gagttccata 5820gcgttaaggt ttcatttagc gcctcaaata gatcctgttc aggaaccgga tcaaagagtt 5880cctccgccgc tggacctacc aaggcaacgc tatgttctct tgcttttgtc agcaagatag 5940ccagatcaat gtcgatcgtg gctggctcga agatacctgc aagaatgtca ttgcgctgcc 6000attctccaaa ttgcagttcg cgcttagctg gataacgcca cggaatgatg tcgtcgtgca 6060caacaatggt gacttctaca gcgcggagaa tctcgctctc tccaggggaa gccgaagttt 6120ccaaaaggtc gttgatcaaa gctcgccgcg ttgtttcatc aagccttacg gtcaccgtaa 6180ccagcaaatc aatatcactg tgtggcttca ggccgccatc cactgcggag ccgtacaaat 6240gtacggccag caacgtcggt tcgagatggc gctcgatgac gccaactacc tctgatagtt 6300gagtcgatac ttcggcgatc accgcttccc tcatgatgtt taactttgtt ttagggcgac 6360tgccctgctg cgtaacatcg ttgctgctcc ataacatcaa acatcgaccc acggcgtaac 6420gcgcttgctg cttggatgcc cgaggcatag actgtacccc aaaaaaacag tcataacaag 6480ccatgaaaac cgccactgcg ccgttaccac cgctgcgttc ggtcaaggtt ctggaccagt 6540tgcgtgagcg catacgctac ttgcattaca gcttacgaac cgaacaggct tatgtccact 6600gggttcgtgc cttcatccgt ttccacggtg tgcgtcaccc ggcaaccttg ggcagcagcg 6660aagtcgaggc atttctgtcc tggctggcga acgagcgcaa ggtttcggtc tccacgcatc 6720gtcaggcatt ggcggccttg ctgttcttct acggcaaggt gctgtgcacg gatctgccct 6780ggcttcagga gatcggaaga cctcggccgt cgcggcgctt gccggtggtg ctgaccccgg 6840atgaagtggt tcgcatcctc ggttttctgg aaggcgagca tcgtttgttc gcccagcttc 6900tgtatggaac gggcatgcgg atcagtgagg gtttgcaact 
gcgggtcaag gatctggatt 6960tcgatcacgg cacgatcatc gtgcgggagg gcaagggctc caaggatcgg gccttgatgt 7020tacccgagag cttggcaccc agcctgcgcg agcaggggaa ttgatccggt ggatgacctt 7080ttgaatgacc tttaatagat tatattacta attaattggg gaccctagag gtcccctttt 7140ttattttaaa aattttttca caaaacggtt tacaagcata acgggttttg ctgcccgcaa 7200acgggctgtt ctggtgttgc tagtttgtta tcagaatcgc agatccggct tcaggtttgc 7260cggctgaaag cgctatttct tccagaattg ccatgatttt ttccccacgg gaggcgtcac 7320tggctcccgt gttgtcggca gctttgattc gataagcagc atcgcctgtt tcaggctgtc 7380tatgtgtgac tgttgagctg taacaagttg tctcaggtgt tcaatttcat gttctagttg 7440ctttgtttta ctggtttcac ctgttctatt aggtgttaca tgctgttcat ctgttacatt 7500gtcgatctgt tcatggtgaa cagctttaaa tgcaccaaaa actcgtaaaa gctctgatgt 7560atctatcttt tttacaccgt tttcatctgt gcatatggac agttttccct ttgatatcta 7620acggtgaaca gttgttctac ttttgtttgt tagtcttgat gcttcactga tagatacaag 7680agccataaga acctcagatc cttccgtatt tagccagtat gttctctagt gtggttcgtt 7740gtttttgcgt gagccatgag aacgaaccat tgagatcatg cttactttgc atgtcactca 7800aaaattttgc ctcaaaactg gtgagctgaa tttttgcagt taaagcatcg tgtagtgttt 7860ttcttagtcc gttacgtagg taggaatctg atgtaatggt tgttggtatt ttgtcaccat 7920tcatttttat ctggttgttc tcaagttcgg ttacgagatc catttgtcta tctagttcaa 7980cttggaaaat caacgtatca gtcgggcggc ctcgcttatc aaccaccaat ttcatattgc 8040tgtaagtgtt taaatcttta cttattggtt tcaaaaccca ttggttaagc cttttaaact 8100catggtagtt attttcaagc attaacatga acttaaattc atcaaggcta atctctatat 8160ttgccttgtg agttttcttt tgtgttagtt cttttaataa ccactcataa atcctcatag 8220agtatttgtt ttcaaaagac ttaacatgtt ccagattata ttttatgaat ttttttaact 8280ggaaaagata aggcaatatc tcttcactaa aaactaattc taatttttcg cttgagaact 8340tggcatagtt tgtccactgg aaaatctcaa agcctttaac caaaggattc ctgatttcca 8400cagttctcgt catcagctct ctggttgctt tagctaatac accataagca ttttccctac 8460tgatgttcat catctgagcg tattggttat aagtgaacga taccgtccgt tctttccttg 8520tagggttttc aatcgtgggg ttgagtagtg ccacacagca taaaattagc ttggtttcat 8580gctccgttaa gtcatagcga ctaatcgcta gttcatttgc tttgaaaaca actaattcag 8640acatacatct 
caattggtct aggtgatttt aatcactata ccaattgaga tgggctagtc 8700aatgataatt actagtcctt ttcctttgag ttgtgggtat ctgtaaattc tgctagacct 8760ttgctggaaa acttgtaaat tctgctagac cctctgtaaa ttccgctaga cctttgtgtg 8820ttttttttgt ttatattcaa gtggttataa tttatagaat aaagaaagaa taaaaaaaga 8880taaaaagaat agatcccagc cctgtgtata actcactact ttagtcagtt ccgcagtatt 8940acaaaaggat gtcgcaaacg ctgtttgctc ctctacaaaa cagaccttaa aaccctaaag 9000gcttaagtag caccctcgca agctcgggca aatcgctgaa tattcctttt gtctccgacc 9060atcaggcacc tgagtcgctg tctttttcgt gacattcagt tcgctgcgct cacggctctg 9120gcagtgaatg ggggtaaatg gcactacagg cgccttttat ggattcatgc aaggaaacta 9180cccataatac aagaaaagcc cgtcacgggc ttctcagggc gttttatggc gggtctgcta 9240tgtggtgcta tctgactttt tgctgttcag cagttcctgc cctctgattt tccagtctga 9300ccacttcgga ttatcccgtg acaggtcatt cagactggct aatgcaccca gtaaggcagc 9360ggtatcatca acaggcttac ccgtcttact gtcaac 9396 74 9292 DNA ArtificialSequence Description of Artificial Sequence plasmid, pAN329 and pAN33074 ttgcggccgc ttcgaactgt tataaaaaaa ggatcaattt tgaactctct cccaaagttg 60atcccttaac gatttagaaa tccctttgag aatgtttata tacattcaag gtaaccagcc 120aactaatgac aatgattcct gaaaaaagta ataacaaatt actatacaga taagttgact 180gatcaacttc cataggtaac aacctttgat caagtaaggg tatggataat aaaccaccta 240caattgcaat acctgttccc tctgataaaa agctggtaaa gttaagcaaa ctcattccag 300caccagcttc ctgctgtttc aagctacttg aaacaattgt tgatataact gttttggtga 360acgaaagccc acctaaaaca aatacgatta taattgtcat gaaccatgat gttgtttcta 420aaagaaagga agcagttaaa aagctaacag aaagaaatgt aactccgatg tttaacacgt 480ataaaggacc tcttctatca acaagtatcc caccaatgta gccgaaaata atgacactca 540ttgttccagg gaaaataatt acacttccga tttcggcagt acttagctgg tgaacatctt 600tcatcatata aggaaccata gagacaaacc ctgctactgt tccaaatata attcccccac 660aaagaactcc aatcataaaa ggtatatttt tccctaatcc gggatcaaca aaaggatctg 720ttactttcct gatatgtttt acaaatatca ggaatgacag cacgctaacg ataagaaaag 780aaatgctata tgatgttgta aacaacataa aaaatacaat gcctacagac attagtataa 840ttcctttgat atcaaaatga ccttttatcc ttacttcttt ctttaataat ttcataagaa 
900acggaacagt gataattgtt atcataggaa tgagtagaag ataggaccaa tgaatataat 960gggctatcat tccaccaatc gctggaccga ctccttctcc catggctact atcgatccaa 1020taagaccaaa tgctttaccc ctattttcct ttggaatata gcgcgcaact acaaccatta 1080cgagtgctgg aaatgcagct gcaccagccc cttgaataaa acgagccata ataagtaagg 1140aaaagaaaga atggccaaca aacccaatta ccgacccgaa acaatttatt ataattccaa 1200ataggagtaa ccttttgatg cctaattgat cagatagctt tccatataca gctgttccaa 1260tggaaaaggt taacataaag gctgtgttca cccagtttgt actcgcaggt ggtttattaa 1320aatcatttgc aatatcaggt aatgagacgt tcaaaaccat ttcatttaat acgctaaaaa 1380aagataaaat gcaaagccaa attaaaattt ggttgtgtcg taaattcgat tgtgaatagg 1440atgtattcac atttcaccct ccaataatga gggcagacgt agtttatagg gttaatgata 1500cgcttccctc ttttaattga accctgttac attcattaca cttcataatt aattcctcct 1560aaacttgatt aaaacatttt accacatata aactaagttt taaattcagt atttcatcac 1620ttatacaaca atatggcccg tttgttgaac tactctttaa taaaataatt tttccgttcc 1680caattccaca ttgcaataat agaaaatcca tcttcatcgg ctttttcgtc atcatctgta 1740tgaatcaaat cgccttcttc tgtgtcatca aggtttaatt ttttatgtat ttcttttaac 1800aaaccaccat aggagattaa ccttttacgg tgtaaacctt cctccaaatc agacaaacgt 1860ttcaaattct tttcttcatc atcggtcata aaatccgtat cctttacagg atattttgca 1920gtttcgtcaa ttgccgattg tatatccgat ttatatttat ttttcggtcg aatcatttga 1980acttttacat ttggatcata gtctaatttc attgcctttt tccaaaattg aatccattgt 2040ttttgattca cgtagttttc tgtattctta aaataagttg gttccacaca taccaataca 2100tgcatgtgct gattataaga attatcttta ttatttattg tcacttccgt tgcacgcata 2160aaaccaacaa gatttttatt aattttttta tattgcatca ttcggcgaaa tccttgagcc 2220atatctgaca aactcttatt taattcttcg ccatcataaa catttttaac tgttaatgtg 2280agaaacaacc aacgaactgt tggcttttgt ttaataactt cagcaacaac cttttgtgac 2340tgaatgccat gtttcattgc tctcctccag ttgcacattg gacaaagcct ggatttacaa 2400aaccacactc gatacaactt tctttcgcct gtttcacgat tttgtttata ctctaatatt 2460tcagcacaat cttttactct ttcagccttt ttaaattcaa gaatatgcag aagttcaaag 2520taatcaacat tagcgatttt cttttctctc catggggaat tggaattctc agtcgctcca 2580gttgcaaacg attctggata gtttgccgga 
tttgcgatag aaccaggccc gccgggaata 2640aagagatccg tattccccgc tgaaaactca gggaaaatat cggccgaacg ccaggcattg 2700accatgtctc tgtaccattc atcaagtcca gagccccctc cccatgagtt attgacaaca 2760tcaggagcca tttccgggtg gggatttcct tccgcgtcct ttggtgctaa aacccattca 2820ccagcttcca aaatgtcagc atcagtgccg ccatcttcag agaacgcttt aacagcaatc 2880cattttgcgc caggtgctac accgatttga tttgttccat caggttcaga gcccaccatc 2940gtgcctgtca cgtgggttcc atgagccaaa tcatcataag ggcttgcctc gcctgctacg 3000gcatcatacc agttcatttc attttcaggc tcattaggat tttccggatt atatccgcga 3060tatttctctt ttaatgccgg atgattccat tccaccccgg tatcaatgga cgcaacaacc 3120gtgccagttc catcatatcc aagtgcccaa gcttttgggg catcgatttg gtctacattc 3180cattccacac cgtcagttgc tttaatagct ttctgtgctt ttttcatatt aaatggggag 3240gatgacttaa aaagctgccg tttctcatta ggaagcacct tttccacttc gggaaactgc 3300accacttttt ccataacctc ttttgaggca tgaacagcaa tcccgttcac cacataataa 3360gaatgaattt ggtctgcatt tcctttatct ttctgggtgt tcaagtattt taggacatct 3420tgctgggatt catcggctgt gacttttaaa gatgacacaa cagcagaacg cttttgatat 3480tccgtcttag cggcagacag cttcttcgat ttcgcttttt taacagccgc ttttgccgct 3540ttttctgggt acccctatgt attactagaa aataacatag taaaacggac atcactccgt 3600ttcaatggag gtgatgtccg tttttcatta caacaaatta cttatctatt tgtaatgctg 3660ctcttggacc cgggatccac gatgtctata caatctgatt cgttcgcaat gagcggcgcc 3720aggcctcctg tcgcaatgac cttgggtcct gttttgcctg ccatttcatt cgcttaacga 3780ttccttccac ttggccgaca tagccaaata aaattccaga ttgcatcgcg ctaacagtgt 3840tttttccgat aatattgtcg ggccgggtga tttcgatacg aggaagcttt gctgcacgcg 3900agtaaagcgc ctctgtcgaa attgtaatcc caggggcaat cgccccgccc atgtattgtt 3960tgttttcatc aatatagcag tacgttgtgg cggttccgaa atcgacaaca attaatggat 4020tgccgtacaa gtgtatcgca gcgacagcat ttacgattct gtctgcccct acttctttcg 4080gattgtcata ttttatattt aaaccggttt tcatacctgg accaacaatt tgaggctcga 4140tatgaaagta ttttgtgcac attctttcta acgcaaacat gattggcggc actactgacg 4200aaataataat gccatctatc tgttcaaaca taagcccgga gtgatcaaat aaggagcgca 4260aaatcatccc aaactcatct tctgttttat gcctgcttgt ttctatacgc cagtgatatt 
4320ctaattttcc atcatgatat acaccaagta cagtattggt gttccccaca tcgataacca 4380gtaacaacct ctatcaccac ttttgaatag tttctctaga acaggcgggg ttgcccccgc 4440ctgtaattaa attattacac accctgtagg gaaagtcaat acctttttgt aaaattttta 4500cacagcgtgg atctcttcta gggacacctc tttgtacccc tcaagggaga aatattggcg 4560gtactgagca cagttttggt tggtggacag tgaaccatag ctgtcgtcaa tagcctcgag 4620ttatggcagt tggttaaaag gaaacaaaaa gaccgttttc acacaaaacg gtctttttcg 4680atttcttttt acagtcacag ccacttttgc aaaaaccgga cagcttcatg ccttataact 4740gctgtttcgg tcgacaagct tcgcgaagcg gccgcaaaat tcactggccg tcgttttaca 4800acgtcgtgac tgggaaaacc ctggcgttac ccaacttaat cgccttgcag cacatccccc 4860tttcgccagc tggcgtaata gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg 4920cagcctgaat ggcgaatggc gcctgatgcg gtattttctc cttacgcatc tgtgcggtat 4980ttcacaccgc atatggtgca ctctcagtac aatctgctct gatgccgcat agttaagcca 5040gccccgacac ccgccaacac ccgctgacta tgcttgtaaa ccgttttgtg aaaaaatttt 5100taaaataaaa aaggggacct ctagggtccc caattaatta gtaatataat ctattaaagg 5160tcattcaaaa ggtcatccac cggatcagct tagtaaagcc ctcgctagat tttaatgcgg 5220atgttgcgat tacttcgcca actattgcga taacaagaaa aagccagcct ttcatgatat 5280atctcccaat ttgtgtaggg cttattatgc acgcttaaaa ataataaaag cagacttgac 5340ctgatagttt ggctgtgagc aattatgtgc ttagtgcatc taacgcttga gttaagccgc 5400gccgcgaagc ggcgtcggct tgaacgaatt gttagacatt atttgccgac taccttggtg 5460atctcgcctt tcacgtagtg gacaaattct tccaactgat ctgcgcgcga ggccaagcga 5520tcttcttctt gtccaagata agcctgtcta gcttcaagta tgacgggctg atactgggcc 5580ggcaggcgct ccattgccca gtcggcagcg acatccttcg gcgcgatttt gccggttact 5640gcgctgtacc aaatgcggga caacgtaagc actacatttc gctcatcgcc agcccagtcg 5700ggcggcgagt tccatagcgt taaggtttca tttagcgcct caaatagatc ctgttcagga 5760accggatcaa agagttcctc cgccgctgga cctaccaagg caacgctatg ttctcttgct 5820tttgtcagca agatagccag atcaatgtcg atcgtggctg gctcgaagat acctgcaaga 5880atgtcattgc gctgccattc tccaaattgc agttcgcgct tagctggata acgccacgga 5940atgatgtcgt cgtgcacaac aatggtgact tctacagcgc ggagaatctc gctctctcca 6000ggggaagccg aagtttccaa aaggtcgttg 
atcaaagctc gccgcgttgt ttcatcaagc 6060cttacggtca ccgtaaccag caaatcaata tcactgtgtg gcttcaggcc gccatccact 6120gcggagccgt acaaatgtac ggccagcaac gtcggttcga gatggcgctc gatgacgcca 6180actacctctg atagttgagt cgatacttcg gcgatcaccg cttccctcat gatgtttaac 6240tttgttttag ggcgactgcc ctgctgcgta acatcgttgc tgctccataa catcaaacat 6300cgacccacgg cgtaacgcgc ttgctgcttg gatgcccgag gcatagactg taccccaaaa 6360aaacagtcat aacaagccat gaaaaccgcc actgcgccgt taccaccgct gcgttcggtc 6420aaggttctgg accagttgcg tgagcgcata cgctacttgc attacagctt acgaaccgaa 6480caggcttatg tccactgggt tcgtgccttc atccgtttcc acggtgtgcg tcacccggca 6540accttgggca gcagcgaagt cgaggcattt ctgtcctggc tggcgaacga gcgcaaggtt 6600tcggtctcca cgcatcgtca ggcattggcg gccttgctgt tcttctacgg caaggtgctg 6660tgcacggatc tgccctggct tcaggagatc ggaagacctc ggccgtcgcg gcgcttgccg 6720gtggtgctga ccccggatga agtggttcgc atcctcggtt ttctggaagg cgagcatcgt 6780ttgttcgccc agcttctgta tggaacgggc atgcggatca gtgagggttt gcaactgcgg 6840gtcaaggatc tggatttcga tcacggcacg atcatcgtgc gggagggcaa gggctccaag 6900gatcgggcct tgatgttacc cgagagcttg gcacccagcc tgcgcgagca ggggaattga 6960tccggtggat gaccttttga atgaccttta atagattata ttactaatta attggggacc 7020ctagaggtcc ccttttttat tttaaaaatt ttttcacaaa acggtttaca agcataacgg 7080gttttgctgc ccgcaaacgg gctgttctgg tgttgctagt ttgttatcag aatcgcagat 7140ccggcttcag gtttgccggc tgaaagcgct atttcttcca gaattgccat gattttttcc 7200ccacgggagg cgtcactggc tcccgtgttg tcggcagctt tgattcgata agcagcatcg 7260cctgtttcag gctgtctatg tgtgactgtt gagctgtaac aagttgtctc aggtgttcaa 7320tttcatgttc tagttgcttt gttttactgg tttcacctgt tctattaggt gttacatgct 7380gttcatctgt tacattgtcg atctgttcat ggtgaacagc tttaaatgca ccaaaaactc 7440gtaaaagctc tgatgtatct atctttttta caccgttttc atctgtgcat atggacagtt 7500ttccctttga tatctaacgg tgaacagttg ttctactttt gtttgttagt cttgatgctt 7560cactgataga tacaagagcc ataagaacct cagatccttc cgtatttagc cagtatgttc 7620tctagtgtgg ttcgttgttt ttgcgtgagc catgagaacg aaccattgag atcatgctta 7680ctttgcatgt cactcaaaaa ttttgcctca aaactggtga gctgaatttt tgcagttaaa 
7740gcatcgtgta gtgtttttct tagtccgtta cgtaggtagg aatctgatgt aatggttgtt 7800ggtattttgt caccattcat ttttatctgg ttgttctcaa gttcggttac gagatccatt 7860tgtctatcta gttcaacttg gaaaatcaac gtatcagtcg ggcggcctcg cttatcaacc 7920accaatttca tattgctgta agtgtttaaa tctttactta ttggtttcaa aacccattgg 7980ttaagccttt taaactcatg gtagttattt tcaagcatta acatgaactt aaattcatca 8040aggctaatct ctatatttgc cttgtgagtt ttcttttgtg ttagttcttt taataaccac 8100tcataaatcc tcatagagta tttgttttca aaagacttaa catgttccag attatatttt 8160atgaattttt ttaactggaa aagataaggc aatatctctt cactaaaaac taattctaat 8220ttttcgcttg agaacttggc atagtttgtc cactggaaaa tctcaaagcc tttaaccaaa 8280ggattcctga tttccacagt tctcgtcatc agctctctgg ttgctttagc taatacacca 8340taagcatttt ccctactgat gttcatcatc tgagcgtatt ggttataagt gaacgatacc 8400gtccgttctt tccttgtagg gttttcaatc gtggggttga gtagtgccac acagcataaa 8460attagcttgg tttcatgctc cgttaagtca tagcgactaa tcgctagttc atttgctttg 8520aaaacaacta attcagacat acatctcaat tggtctaggt gattttaatc actataccaa 8580ttgagatggg ctagtcaatg ataattacta gtccttttcc tttgagttgt gggtatctgt 8640aaattctgct agacctttgc tggaaaactt gtaaattctg ctagaccctc tgtaaattcc 8700gctagacctt tgtgtgtttt ttttgtttat attcaagtgg ttataattta tagaataaag 8760aaagaataaa aaaagataaa aagaatagat cccagccctg tgtataactc actactttag 8820tcagttccgc agtattacaa aaggatgtcg caaacgctgt ttgctcctct acaaaacaga 8880ccttaaaacc ctaaaggctt aagtagcacc ctcgcaagct cgggcaaatc gctgaatatt 8940ccttttgtct ccgaccatca ggcacctgag tcgctgtctt tttcgtgaca ttcagttcgc 9000tgcgctcacg gctctggcag tgaatggggg taaatggcac tacaggcgcc ttttatggat 9060tcatgcaagg aaactaccca taatacaaga aaagcccgtc acgggcttct cagggcgttt 9120tatggcgggt ctgctatgtg gtgctatctg actttttgct gttcagcagt tcctgccctc 9180tgattttcca gtctgaccac ttcggattat cccgtgacag gtcattcaga ctggctaatg 9240cacccagtaa ggcagcggta tcatcaacag gcttacccgt cttactgtca ac 9292 75 3964DNA Artificial Sequence Description of Artificial Sequence plasmid,pOTP71 75 ccatcgaatg gccagatgat taattcctaa tttttgttga cactctatcattgatagagt 60 tattttacca ctccctatca 
gtgatagaga aaagtgaaat gaatagttcgacaaaaatct 120 agaaaaggag gaatttaaat gttactggtt atcgatgtgg ggaacaccaatactgtactt 180 ggtgtatatc atgatggaaa attagaatat cactggcgta tagaaacaagcaggcataaa 240 acagaagatg agtttgggat gattttgcgc tccttatttg atcactccgggcttatgttt 300 gaacagatag atggcattat tatttcgtca gtagtgccgc caatcatgtttgcgttagaa 360 agaatgtgca caaaatactt tcatatcgag cctcaaattg ttggtccaggtatgaaaacc 420 ggtttaaata taaaatatga caatccgaaa gaagtagggg cagacagaatcgtaaatgct 480 gtcgctgcga tacacttgta cggcaatcca ttaattgttg tcgatttcggaaccgccaca 540 acgtactgct atattgatga aaacaaacaa tacatgggcg gggcgattgcccctgggatt 600 acaatttcga cagaggcgct ttactcgcgt gcagcaaagc ttcctcgtatcgaaatcacc 660 cggcccgaca atattatcgg aaaaaacact gttagcgcga tgcaatctggaattttattt 720 ggctatgtcg gccaagtgga aggaatcgtt aagcgaatga aatggcaggcaaaacaggac 780 cccaaggtca ttgcgacagg aggcctggcg ccgctcattg cgaacgaatcagattgtata 840 gacatcgttg atccattctt aaccctaaaa gggctggaat tgatttatgaaagaaaccgc 900 gtaggaagtg tataaggatc cctcgaggtc gacctgcagg gggaccatggtctcagcgct 960 tggagccacc cgcagttcga aaaataataa gcttgacctg tgaagtgaaaaatggcgcac 1020 attgtgcgac attttttttg tctgccgttt accgctactg cgtcacggatctccacgcgc 1080 cctgtagcgg cgcattaagc gcggcgggtg tggtggttac gcgcagcgtgaccgctacac 1140 ttgccagcgc cctagcgccc gctcctttcg ctttcttccc ttcctttctcgccacgttcg 1200 ccggctttcc ccgtcaagct ctaaatcggg ggctcccttt agggttccgatttagtgctt 1260 tacggcacct cgaccccaaa aaacttgatt agggtgatgg ttcacgtagtgggccatcgc 1320 cctgatagac ggtttttcgc cctttgacgt tggagtccac gttctttaatagtggactct 1380 tgttccaaac tggaacaaca ctcaacccta tctcggtcta ttcttttgatttataaggga 1440 ttttgccgat ttcggcctat tggttaaaaa atgagctgat ttaacaaaaatttaacgcga 1500 attttaacaa aatattaacg cttacaattt caggtggcac ttttcggggaaatgtgcgcg 1560 gaacccctat ttgtttattt ttctaaatac attcaaatat gtatccgctcatgagacaat 1620 aaccctgata aatgcttcaa taatattgaa aaaggaagag tatgagtattcaacatttcc 1680 gtgtcgccct tattcccttt tttgcggcat tttgccttcc tgtttttgctcacccagaaa 1740 cgctggtgaa agtaaaagat gctgaagatc agttgggtgc acgagtgggttacatcgaac 1800 
tggatctcaa cagcggtaag atccttgaga gttttcgccc cgaagaacgttttccaatga 1860 tgagcacttt taaagttctg ctatgtggcg cggtattatc ccgtattgacgccgggcaag 1920 agcaactcgg tcgccgcata cactattctc agaatgactt ggttgagtactcaccagtca 1980 cagaaaagca tcttacggat ggcatgacag taagagaatt atgcagtgctgccataacca 2040 tgagtgataa cactgcggcc aacttacttc tgacaacgat cggaggaccgaaggagctaa 2100 ccgctttttt gcacaacatg ggggatcatg taactcgcct tgatcgttgggaaccggagc 2160 tgaatgaagc cataccaaac gacgagcgtg acaccacgat gcctgtagcaatggcaacaa 2220 cgttgcgcaa actattaact ggcgaactac ttactctagc ttcccggcaacaattgatag 2280 actggatgga ggcggataaa gttgcaggac cacttctgcg ctcggcccttccggctggct 2340 ggtttattgc tgataaatct ggagccggtg agcgtggctc tcgcggtatcattgcagcac 2400 tggggccaga tggtaagccc tcccgtatcg tagttatcta cacgacggggagtcaggcaa 2460 ctatggatga acgaaataga cagatcgctg agataggtgc ctcactgattaagcattggt 2520 aggaattaat gatgtctcgt ttagataaaa gtaaagtgat taacagcgcattagagctgc 2580 ttaatgaggt cggaatcgaa ggtttaacaa cccgtaaact cgcccagaagctaggtgtag 2640 agcagcctac attgtattgg catgtaaaaa ataagcgggc tttgctcgacgccttagcca 2700 ttgagatgtt agataggcac catactcact tttgcccttt agaaggggaaagctggcaag 2760 attttttacg taataacgct aaaagtttta gatgtgcttt actaagtcatcgcgatggag 2820 caaaagtaca tttaggtaca cggcctacag aaaaacagta tgaaactctcgaaaatcaat 2880 tagccttttt atgccaacaa ggtttttcac tagagaatgc attatatgcactcagcgcag 2940 tggggcattt tactttaggt tgcgtattgg aagatcaaga gcatcaagtcgctaaagaag 3000 aaagggaaac acctactact gatagtatgc cgccattatt acgacaagctatcgaattat 3060 ttgatcacca aggtgcagag ccagccttct tattcggcct tgaattgatcatatgcggat 3120 tagaaaaaca acttaaatgt gaaagtgggt cttaaaagca gcataacctttttccgtgat 3180 ggtaacttca ctagtttaaa aggatctagg tgaagatcct ttttgataatctcatgacca 3240 aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaaaagatcaaag 3300 gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaacaaaaaaaccac 3360 cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactctttttccgaaggtaa 3420 ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccgtagttaggcc 3480 accacttcaa gaactctgta gcaccgccta 
catacctcgc tctgctaatcctgttaccag 3540 tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaagacgatagttac 3600 cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagcccagcttggagc 3660 gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagcgccacgcttc 3720 ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaacaggagagcgca 3780 cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcgggtttcgccacc 3840 tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagcctatggaaaaacg 3900 ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgctcacatgaccc 3960 gaca 3964 76 3859 DNA Artificial Sequence Description ofArtificial Sequence plasmid, pOTP72 76 ccatcgaatg gccagatgat taattcctaatttttgttga cactctatca ttgatagagt 60 tattttacca ctccctatca gtgatagagaaaagtgaaat gaatagttcg acaaaaatct 120 agaaaaggag gaatttaaat gccagctaggcaatctttta cagatttgaa aaacctggtt 180 ttgtgcgata taggcaacac gcgtatccattttgcacaaa actatcagct cttttcaagc 240 gctaaagaag atttaaagcg tttgggtattcaaaaggaaa ttttttacat tagcgtgaat 300 gaagaaaatg aaaaagccct tttgaattgttaccctaacg ctaaaaatat tgcagggttt 360 tttcatttag aaaccgacta tgtagggcttgggatagacc ggcaaatggc gtgtctggcg 420 gtaaataatg gcgtggtggt ggatgccgggagtgcgatta cgatagattt aatcaaagag 480 ggcaagcatt taggagggtg tattttacccggtttagccc aatatattca tgcgtataaa 540 aaaagcgcta aaattttaga gcaacctttcaaggccttag attctttaga agttttacct 600 aaaagcacta gagacgctgt gaattacggcatggttttga gcgtcattgc ttgtatccag 660 catttagcca aaaatcaaaa aatctatctttgtgggggcg atgcgaagta tttgagcgcg 720 tttttacccc attctgtttg caaggagcgtttggtttttg acgggatgga aatcgctctt 780 aaaaaagcag ggatactaga atgcaaataaggatccctcg aggtcgacct gcagggggac 840 catggtctca gcgcttggag ccacccgcagttcgaaaaat aataagcttg acctgtgaag 900 tgaaaaatgg cgcacattgt gcgacattttttttgtctgc cgtttaccgc tactgcgtca 960 cggatctcca cgcgccctgt agcggcgcattaagcgcggc gggtgtggtg gttacgcgca 1020 gcgtgaccgc tacacttgcc agcgccctagcgcccgctcc tttcgctttc ttcccttcct 1080 ttctcgccac gttcgccggc tttccccgtcaagctctaaa tcgggggctc cctttagggt 1140 tccgatttag tgctttacgg cacctcgaccccaaaaaact tgattagggt gatggttcac 
1200 gtagtgggcc atcgccctga tagacggtttttcgcccttt gacgttggag tccacgttct 1260 ttaatagtgg actcttgttc caaactggaacaacactcaa ccctatctcg gtctattctt 1320 ttgatttata agggattttg ccgatttcggcctattggtt aaaaaatgag ctgatttaac 1380 aaaaatttaa cgcgaatttt aacaaaatattaacgcttac aatttcaggt ggcacttttc 1440 ggggaaatgt gcgcggaacc cctatttgtttatttttcta aatacattca aatatgtatc 1500 cgctcatgag acaataaccc tgataaatgcttcaataata ttgaaaaagg aagagtatga 1560 gtattcaaca tttccgtgtc gcccttattcccttttttgc ggcattttgc cttcctgttt 1620 ttgctcaccc agaaacgctg gtgaaagtaaaagatgctga agatcagttg ggtgcacgag 1680 tgggttacat cgaactggat ctcaacagcggtaagatcct tgagagtttt cgccccgaag 1740 aacgttttcc aatgatgagc acttttaaagttctgctatg tggcgcggta ttatcccgta 1800 ttgacgccgg gcaagagcaa ctcggtcgccgcatacacta ttctcagaat gacttggttg 1860 agtactcacc agtcacagaa aagcatcttacggatggcat gacagtaaga gaattatgca 1920 gtgctgccat aaccatgagt gataacactgcggccaactt acttctgaca acgatcggag 1980 gaccgaagga gctaaccgct tttttgcacaacatggggga tcatgtaact cgccttgatc 2040 gttgggaacc ggagctgaat gaagccataccaaacgacga gcgtgacacc acgatgcctg 2100 tagcaatggc aacaacgttg cgcaaactattaactggcga actacttact ctagcttccc 2160 ggcaacaatt gatagactgg atggaggcggataaagttgc aggaccactt ctgcgctcgg 2220 cccttccggc tggctggttt attgctgataaatctggagc cggtgagcgt ggctctcgcg 2280 gtatcattgc agcactgggg ccagatggtaagccctcccg tatcgtagtt atctacacga 2340 cggggagtca ggcaactatg gatgaacgaaatagacagat cgctgagata ggtgcctcac 2400 tgattaagca ttggtaggaa ttaatgatgtctcgtttaga taaaagtaaa gtgattaaca 2460 gcgcattaga gctgcttaat gaggtcggaatcgaaggttt aacaacccgt aaactcgccc 2520 agaagctagg tgtagagcag cctacattgtattggcatgt aaaaaataag cgggctttgc 2580 tcgacgcctt agccattgag atgttagataggcaccatac tcacttttgc cctttagaag 2640 gggaaagctg gcaagatttt ttacgtaataacgctaaaag ttttagatgt gctttactaa 2700 gtcatcgcga tggagcaaaa gtacatttaggtacacggcc tacagaaaaa cagtatgaaa 2760 ctctcgaaaa tcaattagcc tttttatgccaacaaggttt ttcactagag aatgcattat 2820 atgcactcag cgcagtgggg cattttactttaggttgcgt attggaagat caagagcatc 2880 aagtcgctaa agaagaaagg 
gaaacacctactactgatag tatgccgcca ttattacgac 2940 aagctatcga attatttgat caccaaggtgcagagccagc cttcttattc ggccttgaat 3000 tgatcatatg cggattagaa aaacaacttaaatgtgaaag tgggtcttaa aagcagcata 3060 acctttttcc gtgatggtaa cttcactagtttaaaaggat ctaggtgaag atcctttttg 3120 ataatctcat gaccaaaatc ccttaacgtgagttttcgtt ccactgagcg tcagaccccg 3180 tagaaaagat caaaggatct tcttgagatcctttttttct gcgcgtaatc tgctgcttgc 3240 aaacaaaaaa accaccgcta ccagcggtggtttgtttgcc ggatcaagag ctaccaactc 3300 tttttccgaa ggtaactggc ttcagcagagcgcagatacc aaatactgtc cttctagtgt 3360 agccgtagtt aggccaccac ttcaagaactctgtagcacc gcctacatac ctcgctctgc 3420 taatcctgtt accagtggct gctgccagtggcgataagtc gtgtcttacc gggttggact 3480 caagacgata gttaccggat aaggcgcagcggtcgggctg aacggggggt tcgtgcacac 3540 agcccagctt ggagcgaacg acctacaccgaactgagata cctacagcgt gagctatgag 3600 aaagcgccac gcttcccgaa gggagaaaggcggacaggta tccggtaagc ggcagggtcg 3660 gaacaggaga gcgcacgagg gagcttccagggggaaacgc ctggtatctt tatagtcctg 3720 tcgggtttcg ccacctctga cttgagcgtcgatttttgtg atgctcgtca ggggggcgga 3780 gcctatggaa aaacgccagc aacgcggcctttttacggtt cctggccttt tgctggcctt 3840 ttgctcacat gacccgaca 3859 77 3934DNA Artificial Sequence Description of Artificial Sequence plasmid,pOTP73 77 ccatcgaatg gccagatgat taattcctaa tttttgttga cactctatcattgatagagt 60 tattttacca ctccctatca gtgatagaga aaagtgaaat gaatagttcgacaaaaatct 120 agaaaaggag gaatttaaat gattcttgag ctcgactgtg gaaactcgctgatcaagtgg 180 cgggtcatcg agggggcggc gcggtcggtc gccggtggcc ttgcggagtccgatgatgcc 240 ctggtcgaac agttaacgtc gcagcaagcg ctgccagtgc gagcctgtcgcctggtgagc 300 gttcgcagcg agcaggaaac ctcgcaactg gtcgcacggt tggagcagctgttcccggtt 360 tcggcgctgg ttgcatcatc cggcaagcag ttggcgggtg tgcgcaacggctatctcgat 420 taccagcgcc tggggctcga ccgctggctg gccctcgtcg cggctcatcacctggctaag 480 aaggcctgcc tggtcattga tctggggacc gcggtcacct ctgacctggtcgcggcggat 540 ggagtgcatc tggggggcta catatgcccg ggcatgaccc tgatgagaagccagttgcgc 600 acccataccc gacgtatccg ctacgacgat gcagaggccc ggcgggcgcttgccagtctc 660 cagccagggc aggccacggc cgaggcggtt 
gagcggggtt gtctgctcatgctcaggggg 720 ttcgttcgtg agcagtacgc catggcgtgc gagctgctcg gtccggattgtgaaatattc 780 ctgacgggtg gggatgccga actggttcgc gacgaactgg ctggcgcccggatcatgccg 840 gacctggttt tcgtagggct ggcactggct tgcccgattg agtaaggatccctcgaggtc 900 gacctgcagg gggaccatgg tctcagcgct tggagccacc cgcagttcgaaaaataataa 960 gcttgacctg tgaagtgaaa aatggcgcac attgtgcgac attttttttgtctgccgttt 1020 accgctactg cgtcacggat ctccacgcgc cctgtagcgg cgcattaagcgcggcgggtg 1080 tggtggttac gcgcagcgtg accgctacac ttgccagcgc cctagcgcccgctcctttcg 1140 ctttcttccc ttcctttctc gccacgttcg ccggctttcc ccgtcaagctctaaatcggg 1200 ggctcccttt agggttccga tttagtgctt tacggcacct cgaccccaaaaaacttgatt 1260 agggtgatgg ttcacgtagt gggccatcgc cctgatagac ggtttttcgccctttgacgt 1320 tggagtccac gttctttaat agtggactct tgttccaaac tggaacaacactcaacccta 1380 tctcggtcta ttcttttgat ttataaggga ttttgccgat ttcggcctattggttaaaaa 1440 atgagctgat ttaacaaaaa tttaacgcga attttaacaa aatattaacgcttacaattt 1500 caggtggcac ttttcgggga aatgtgcgcg gaacccctat ttgtttatttttctaaatac 1560 attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaataatattgaa 1620 aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttttttgcggcat 1680 tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagatgctgaagatc 1740 agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaagatccttgaga 1800 gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctgctatgtggcg 1860 cggtattatc ccgtattgac gccgggcaag agcaactcgg tcgccgcatacactattctc 1920 agaatgactt ggttgagtac tcaccagtca cagaaaagca tcttacggatggcatgacag 1980 taagagaatt atgcagtgct gccataacca tgagtgataa cactgcggccaacttacttc 2040 tgacaacgat cggaggaccg aaggagctaa ccgctttttt gcacaacatgggggatcatg 2100 taactcgcct tgatcgttgg gaaccggagc tgaatgaagc cataccaaacgacgagcgtg 2160 acaccacgat gcctgtagca atggcaacaa cgttgcgcaa actattaactggcgaactac 2220 ttactctagc ttcccggcaa caattgatag actggatgga ggcggataaagttgcaggac 2280 cacttctgcg ctcggccctt ccggctggct ggtttattgc tgataaatctggagccggtg 2340 agcgtggctc tcgcggtatc attgcagcac tggggccaga tggtaagccctcccgtatcg 2400 
tagttatcta cacgacgggg agtcaggcaa ctatggatga acgaaatagacagatcgctg 2460 agataggtgc ctcactgatt aagcattggt aggaattaat gatgtctcgtttagataaaa 2520 gtaaagtgat taacagcgca ttagagctgc ttaatgaggt cggaatcgaaggtttaacaa 2580 cccgtaaact cgcccagaag ctaggtgtag agcagcctac attgtattggcatgtaaaaa 2640 ataagcgggc tttgctcgac gccttagcca ttgagatgtt agataggcaccatactcact 2700 tttgcccttt agaaggggaa agctggcaag attttttacg taataacgctaaaagtttta 2760 gatgtgcttt actaagtcat cgcgatggag caaaagtaca tttaggtacacggcctacag 2820 aaaaacagta tgaaactctc gaaaatcaat tagccttttt atgccaacaaggtttttcac 2880 tagagaatgc attatatgca ctcagcgcag tggggcattt tactttaggttgcgtattgg 2940 aagatcaaga gcatcaagtc gctaaagaag aaagggaaac acctactactgatagtatgc 3000 cgccattatt acgacaagct atcgaattat ttgatcacca aggtgcagagccagccttct 3060 tattcggcct tgaattgatc atatgcggat tagaaaaaca acttaaatgtgaaagtgggt 3120 cttaaaagca gcataacctt tttccgtgat ggtaacttca ctagtttaaaaggatctagg 3180 tgaagatcct ttttgataat ctcatgacca aaatccctta acgtgagttttcgttccact 3240 gagcgtcaga ccccgtagaa aagatcaaag gatcttcttg agatcctttttttctgcgcg 3300 taatctgctg cttgcaaaca aaaaaaccac cgctaccagc ggtggtttgtttgccggatc 3360 aagagctacc aactcttttt ccgaaggtaa ctggcttcag cagagcgcagataccaaata 3420 ctgtccttct agtgtagccg tagttaggcc accacttcaa gaactctgtagcaccgccta 3480 catacctcgc tctgctaatc ctgttaccag tggctgctgc cagtggcgataagtcgtgtc 3540 ttaccgggtt ggactcaaga cgatagttac cggataaggc gcagcggtcgggctgaacgg 3600 ggggttcgtg cacacagccc agcttggagc gaacgaccta caccgaactgagatacctac 3660 agcgtgagct atgagaaagc gccacgcttc ccgaagggag aaaggcggacaggtatccgg 3720 taagcggcag ggtcggaaca ggagagcgca cgagggagct tccagggggaaacgcctggt 3780 atctttatag tcctgtcggg tttcgccacc tctgacttga gcgtcgatttttgtgatgct 3840 cgtcaggggg gcggagccta tggaaaaacg ccagcaacgc ggcctttttacggttcctgg 3900 ccttttgctg gccttttgct cacatgaccc gaca 3934

What is claimed:
 1. An assay for the identification of an antibiotic comprising: (a) contacting an assay composition comprising a CoaX protein with a test compound; and (b) determining the ability of the test compound to inhibit the activity of the CoaX protein; wherein the compound is identified as an antibiotic based on the ability of the compound to inhibit the activity of the CoaX protein.
 2. The assay of claim 1, wherein the assay composition comprises purified CoaX protein.
 3. The assay of claim 1, wherein the assay composition comprises partially purified CoaX protein.
 4. The assay of claim 1, wherein the assay composition comprises crude cell extracts from a cell producing CoaX protein.
 5. The assay of claim 1, wherein the CoaX protein is encoded by a coaX gene derived from a pathogenic bacterium selected from the group consisting of Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Pseudomonas aeruginosa, Treponema pallidum and Xylella fastidiosa.
 6. The assay of claim 5, wherein the CoaX protein has an amino acid sequence selected from the group consisting of SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:20, SEQ ID NO:10 and SEQ ID NO:65.
 7. The assay of claim 1, wherein the CoaX protein is encoded by a coaX gene derived from a pathogenic bacterium selected from the group consisting of Bacillus anthracis, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Porphyromonas gingivalis, Pseudomonas aeruginosa, Treponema pallidum and Xylella fastidiosa.
 8. The assay of claim 7, wherein the CoaX protein has an amino acid sequence selected from the group consisting of SEQ ID NO:45, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10 and SEQ ID NO:65.
 9. The assay of claim 1, wherein the CoaX is encoded by a coaX gene derived from a bacterium selected from the group consisting of Aquifex aeolicus, Bacillus anthracis, Bacillus halodurans, Bacillus stearothermophilus, Bacillus subtilis, Caulobacter crescentus, Chlorobium tepidum, Clostridium acetobutylicum, Dehalococcoides ethenogenes, Deinococcus radiodurans, Desulfovibrio vulgaris, Geobacter sulfurreducens, Pseudomonas putida, Rhodobacter capsulatus, Thiobacillus ferrooxidans, Streptomyces coelicolor, Synechocystis sp., Thermotoga maritima, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Porphyromonas gingivalis, Pseudomonas aeruginosa, Treponema pallidum, Xylella fastidiosa and Mycobacterium tuberculosis.
 10. The assay of claim 9, wherein the CoaX protein has an amino acid sequence selected from the group consisting of SEQ ID NO:12, SEQ ID NO:70, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:2, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:3, SEQ ID NO:57, SEQ ID NO:8, SEQ ID NO:59, SEQ ID NO:7, SEQ ID NO:61, SEQ ID NO:6, SEQ ID NO:63, SEQ ID NO:4, SEQ ID NO:13, SEQ ID NO:9, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10, SEQ ID NO:65 and SEQ ID NO:5.
 11. The assay of claim 1, wherein said composition is further contacted with pantothenate or a pantothenate analog.
 12. The assay of claim 11, wherein the ability to modulate activity of CoaX is determined based on the ability of the test compound to affect levels of pantothenate or pantothenate analog in the assay mixture.
 13. An assay for the identification of a potential antibiotic comprising: (a) contacting an assay composition comprising CoaX with a test compound; and (b) determining the ability of the test compound to bind to the CoaX; wherein the compound is identified as a potential antibiotic based on the ability of the compound to bind to the CoaX.
 14. An assay for the identification of an antibiotic comprising: (a) contacting an assay composition comprising CoaX with a test compound; (b) determining the ability of the test compound to bind to the CoaX; (c) selecting the test compound as a potential antibiotic based on the ability to bind to the CoaX; and (d) further determining the ability of the selected compound to inhibit the activity of a CoaX; wherein the compound is identified as a potential antibiotic based on the ability of the compound to bind to the CoaX.
 15. An assay for the identification of a potential antibiotic comprising: (a) contacting an assay composition comprising CoaX with a test compound and pantothenate or a pantothenate analog; and (b) determining the ability of the test compound to modulate binding of the pantothenate or pantothenate analog to the CoaX; wherein the compound is identified as a potential antibiotic based on the ability of the compound to modulate binding of the pantothenate or pantothenate analog to the CoaX.
 16. An assay for the identification of an antibiotic comprising: (a) contacting an assay composition comprising CoaX with a test compound and pantothenate or a pantothenate analog; (b) determining the ability of the test compound to modulate binding of the pantothenate or pantothenate analog to the CoaX; (c) selecting the test compound as a potential antibiotic based on the ability to modulate binding of the pantothenate or pantothenate analog to the CoaX; and (d) further determining the ability of the selected compound to inhibit the activity of a CoaX; wherein the compound is identified as a potential antibiotic based on the ability of the compound to bind to the CoaX.
 17. A method for identifying compounds which modulate pantothenate kinase activity comprising contacting a recombinant cell expressing a single pantothenate kinase encoded by a coaX gene with a test compound and determining the ability of the test compound to modulate pantothenate kinase activity in said cell.
 18. A method for identifying compounds which modulate pantothenate kinase activity comprising contacting a recombinant cell expressing a first and second pantothenate kinase, with a test compound and determining the ability of the test compound to modulate pantothenate kinase activity in said cell, wherein the first or second pantothenate kinase has reduced activity.
 19. The method of claim 18, wherein said first pantothenate kinase is encoded by a coaA gene and said second pantothenate kinase is encoded by a coaX gene.
 20. The method of claim 18, wherein said first pantothenate kinase is encoded by a coaX gene and said second pantothenate kinase is encoded by a coaA gene.
 21. The method of claim 18, wherein said recombinant cell is a Gram negative microorganism.
 22. The method of claim 18, wherein said recombinant cell is a Gram positive microorganism.
 23. The method of claim 18, wherein the microorganism is of the genus Bacillus or Escherichia.
 24. The method of claim 18, wherein the microorganism is Bacillus subtilis or Escherichia coli.
 25. The method of claim 18, wherein determining the ability of the test compound to modulate pantothenate kinase activity in said cell comprises determining the ability of the test compound to inhibit pantothenate kinase activity.
 26. An isolated nucleic acid molecule comprising a coaX gene.
 27. An isolated pantothenate kinase protein encoded by a coaX gene.
 28. The pantothenate kinase of claim 27, which is encoded by a coaX gene derived from a pathogenic bacterium selected from the group consisting of Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Helicobacter pylori, Neisseria meningitidis, Pseudomonas aeruginosa, Treponema pallidum and Xylella fastidiosa.
 29. The pantothenate kinase of claim 28, having an amino acid sequence selected from the group consisting of SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:20, SEQ ID NO:10 and SEQ ID NO:65.
 30. The pantothenate kinase of claim 27, which is encoded by a coaX gene derived from a pathogenic bacterium selected from the group consisting of Bacillus anthracis, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Porphyromonas gingivalis, Pseudomonas aeruginosa, Treponema pallidum and Xylella fastidiosa.
 31. The pantothenate kinase of claim 30, having an amino acid sequence selected from the group consisting of SEQ ID NO:45, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10 and SEQ ID NO:65.
 32. The pantothenate kinase of claim 27, which is encoded by a coaX gene derived from a bacterium selected from the group consisting of Aquifex aeolicus, Bacillus anthracis, Bacillus halodurans, Bacillus stearothermophilus, Bacillus subtilis, Caulobacter crescentus, Chlorobium tepidum, Clostridium acetobutylicum, Dehalococcoides ethenogenes, Deinococcus radiodurans, Desulfovibrio vulgaris, Geobacter sulfurreducens, Pseudomonas putida, Rhodobacter capsulatus, Thiobacillus ferrooxidans, Streptomyces coelicolor, Synechocystis sp., Thermotoga maritima, Bordetella pertussis, Borrelia burgdorferi, Campylobacter jejuni, Clostridium difficile, Helicobacter pylori, Neisseria meningitidis, Neisseria gonorrhoeae, Porphyromonas gingivalis, Pseudomonas aeruginosa, Treponema pallidum, Xylella fastidiosa and Mycobacterium tuberculosis.
 33. The pantothenate kinase of claim 32, having an amino acid sequence selected from the group consisting of SEQ ID NO:12, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:2, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:3, SEQ ID NO:57, SEQ ID NO:8, SEQ ID NO:59, SEQ ID NO:7, SEQ ID NO:61, SEQ ID NO:6, SEQ ID NO:63, SEQ ID NO:4, SEQ ID NO:13, SEQ ID NO:9, SEQ ID NO:15, SEQ ID NO:11, SEQ ID NO:21, SEQ ID NO:55, SEQ ID NO:14 or SEQ ID NO:67, SEQ ID NO:43 or SEQ ID NO:22, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:20, SEQ ID NO:10, SEQ ID NO:65 and SEQ ID NO:5.
 34. A recombinant vector comprising an isolated coaX gene.
 35. A recombinant microorganism comprising the vector of claim 34.
 36. A recombinant microorganism selected from the group consisting of PA861, PA876, YH1 comprising pOTP71, YH1 comprising pOTP72, YH1 comprising pOTP73, and YH1 comprising pAN341.